linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections
@ 2015-09-16 21:23 Matthew R. Ochs
  2015-09-16 21:25 ` [PATCH v2 01/30] cxlflash: Fix to avoid invalid port_sel value Matthew R. Ochs
                   ` (29 more replies)
  0 siblings, 30 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:23 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev

This patch set contains various fixes and corrections for issues that
were found during testing and code review. The series is based upon the
code upstreamed in 4.3 and is intended for the rc phase. The entire
set is bisectable. Please refer to the changelog below for details
on what has changed from previous versions of this patch set.

v2 Changes:
- Incorporate comments from Ian Munsie
- Rework commit messages to be more descriptive
- Add state change serialization patch

Manoj Kumar (3):
  cxlflash: Fix to avoid invalid port_sel value
  cxlflash: Replace magic numbers with literals
  cxlflash: Fix read capacity timeout

Matthew R. Ochs (27):
  cxlflash: Fix potential oops following LUN removal
  cxlflash: Fix data corruption when vLUN used over multiple cards
  cxlflash: Fix to avoid sizeof(bool)
  cxlflash: Fix context encode mask width
  cxlflash: Fix to avoid CXL services during EEH
  cxlflash: Check for removal when processing interrupt
  cxlflash: Correct naming of limbo state and waitq
  cxlflash: Make functions static
  cxlflash: Refine host/device attributes
  cxlflash: Fix to avoid spamming the kernel log
  cxlflash: Fix to avoid stall while waiting on TMF
  cxlflash: Fix location of setting resid
  cxlflash: Fix host link up event handling
  cxlflash: Fix async interrupt bypass logic
  cxlflash: Remove dual port online dependency
  cxlflash: Fix AFU version access/storage and add check
  cxlflash: Correct usage of scsi_host_put()
  cxlflash: Fix to prevent workq from accessing freed memory
  cxlflash: Correct behavior in device reset handler following EEH
  cxlflash: Remove unnecessary scsi_block_requests
  cxlflash: Fix function prolog parameters and return codes
  cxlflash: Fix MMIO and endianness errors
  cxlflash: Fix to prevent EEH recovery failure
  cxlflash: Correct spelling, grammar, and alignment mistakes
  cxlflash: Fix to prevent stale AFU RRQ
  cxlflash: Fix to avoid state change collision
  MAINTAINERS: Add cxlflash driver

 MAINTAINERS                       |    9 +
 drivers/scsi/cxlflash/common.h    |   29 +-
 drivers/scsi/cxlflash/lunmgt.c    |    9 +-
 drivers/scsi/cxlflash/main.c      | 1575 ++++++++++++++++++++-----------------
 drivers/scsi/cxlflash/main.h      |    1 +
 drivers/scsi/cxlflash/sislite.h   |    8 +-
 drivers/scsi/cxlflash/superpipe.c |  177 +++--
 drivers/scsi/cxlflash/superpipe.h |   11 +-
 drivers/scsi/cxlflash/vlun.c      |   39 +-
 9 files changed, 1036 insertions(+), 822 deletions(-)

-- 
2.1.0


* [PATCH v2 01/30] cxlflash: Fix to avoid invalid port_sel value
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
@ 2015-09-16 21:25 ` Matthew R. Ochs
  2015-09-18  1:16   ` Brian King
  2015-09-16 21:26 ` [PATCH v2 02/30] cxlflash: Replace magic numbers with literals Matthew R. Ochs
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:25 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj Kumar, Manoj N. Kumar

From: Manoj Kumar <kumarmn@us.ibm.com>

If two concurrent MANAGE_LUN ioctls are issued with the same
WWID parameter, port_sel can end up with an incorrect value.

This is because port_sel is modified without any locks being
held. If the first caller stalls after the return from
find_and_create_lun(), the value of port_sel will be set
incorrectly to indicate a single port, though in this case
it should have been set to both ports.

To fix, use the global mutex to serialize the lookup of the
WWID and the subsequent modification of port_sel.
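
The locking change described above can be sketched in userspace C as
follows. This is only an analogue under stated assumptions: a pthread
mutex stands in for the kernel mutex, and struct lun, find_or_create()
and manage_lun() are hypothetical names, not the driver's code. The
port encoding (channel + 1, both ports = 3) mirrors CHAN2PORT.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define WWID_LEN   16
#define MAX_LUNS   8
#define BOTH_PORTS 3	/* port 1 | port 2 */

/* Hypothetical stand-in for a LUN entry. */
struct lun {
	unsigned char wwid[WWID_LEN];
	unsigned int port_sel;
	bool in_use;
	bool newly_created;
};

static struct lun luns[MAX_LUNS];
static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Look up a LUN by WWID or create it. Caller must hold global_mutex. */
static struct lun *find_or_create(const unsigned char *wwid)
{
	struct lun *free_slot = NULL;

	for (int i = 0; i < MAX_LUNS; i++) {
		if (luns[i].in_use && !memcmp(luns[i].wwid, wwid, WWID_LEN)) {
			luns[i].newly_created = false;
			return &luns[i];
		}
		if (!luns[i].in_use && !free_slot)
			free_slot = &luns[i];
	}

	if (free_slot) {
		memcpy(free_slot->wwid, wwid, WWID_LEN);
		free_slot->in_use = true;
		free_slot->newly_created = true;
		free_slot->port_sel = 0;
	}
	return free_slot;
}

/* The lookup and the port_sel update form one critical section. */
static int manage_lun(const unsigned char *wwid, unsigned int chan)
{
	struct lun *l;
	int rc = 0;

	assert(chan < 2);

	pthread_mutex_lock(&global_mutex);
	l = find_or_create(wwid);
	if (!l) {
		rc = -1;
		goto out;
	}
	/* First sighting: one port. Seen before: reachable via both. */
	l->port_sel = l->newly_created ? chan + 1 : BOTH_PORTS;
out:
	pthread_mutex_unlock(&global_mutex);
	return rc;
}
```

Because the lock spans both steps, a second caller cannot observe or
alter the entry between the lookup and the port_sel assignment.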

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/lunmgt.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/cxlflash/lunmgt.c b/drivers/scsi/cxlflash/lunmgt.c
index d98ad0f..8c372fc 100644
--- a/drivers/scsi/cxlflash/lunmgt.c
+++ b/drivers/scsi/cxlflash/lunmgt.c
@@ -120,7 +120,8 @@ static struct glun_info *lookup_global(u8 *wwid)
  *
  * The LUN is kept both in a local list (per adapter) and in a global list
  * (across all adapters). Certain attributes of the LUN are local to the
- * adapter (such as index, port selection mask etc.).
+ * adapter (such as index, port selection mask, etc.).
+ *
  * The block allocation map is shared across all adapters (i.e. associated
  * wih the global list). Since different attributes are associated with
  * the per adapter and global entries, allocate two separate structures for each
@@ -128,6 +129,8 @@ static struct glun_info *lookup_global(u8 *wwid)
  *
  * Keep a pointer back from the local to the global entry.
  *
+ * This routine assumes the caller holds the global mutex.
+ *
  * Return: Found/Allocated local lun_info structure on success, NULL on failure
  */
 static struct llun_info *find_and_create_lun(struct scsi_device *sdev, u8 *wwid)
@@ -137,7 +140,6 @@ static struct llun_info *find_and_create_lun(struct scsi_device *sdev, u8 *wwid)
 	struct Scsi_Host *shost = sdev->host;
 	struct cxlflash_cfg *cfg = shost_priv(shost);
 
-	mutex_lock(&global.mutex);
 	if (unlikely(!wwid))
 		goto out;
 
@@ -169,7 +171,6 @@ static struct llun_info *find_and_create_lun(struct scsi_device *sdev, u8 *wwid)
 	list_add(&gli->list, &global.gluns);
 
 out:
-	mutex_unlock(&global.mutex);
 	pr_debug("%s: returning %p\n", __func__, lli);
 	return lli;
 }
@@ -235,6 +236,7 @@ int cxlflash_manage_lun(struct scsi_device *sdev,
 	u64 flags = manage->hdr.flags;
 	u32 chan = sdev->channel;
 
+	mutex_lock(&global.mutex);
 	lli = find_and_create_lun(sdev, manage->wwid);
 	pr_debug("%s: ENTER: WWID = %016llX%016llX, flags = %016llX li = %p\n",
 		 __func__, get_unaligned_le64(&manage->wwid[0]),
@@ -261,6 +263,7 @@ int cxlflash_manage_lun(struct scsi_device *sdev,
 	}
 
 out:
+	mutex_unlock(&global.mutex);
 	pr_debug("%s: returning rc=%d\n", __func__, rc);
 	return rc;
 }
-- 
2.1.0


* [PATCH v2 02/30] cxlflash: Replace magic numbers with literals
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
  2015-09-16 21:25 ` [PATCH v2 01/30] cxlflash: Fix to avoid invalid port_sel value Matthew R. Ochs
@ 2015-09-16 21:26 ` Matthew R. Ochs
  2015-09-18  1:18   ` Brian King
  2015-09-16 21:26 ` [PATCH v2 03/30] cxlflash: Fix read capacity timeout Matthew R. Ochs
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:26 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj Kumar, Manoj N. Kumar

From: Manoj Kumar <kumarmn@us.ibm.com>

Magic numbers are not meaningful and can create confusion. As a
remedy, replace them with descriptive literals.

Replace 512 with literal MAX_SECTOR_UNIT.
Replace 5 with literal CMD_RETRIES.
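
As an illustration of the max_xfer calculation that uses
MAX_SECTOR_UNIT, here is a minimal userspace sketch. The helper name
max_xfer_blocks() is hypothetical; only the constant and the formula
come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SECTOR_UNIT 512	/* max_sectors is in 512 byte multiples */

/*
 * Convert a host limit expressed in 512-byte sectors into a limit
 * expressed in device blocks of blk_len bytes.
 */
static uint64_t max_xfer_blocks(uint32_t max_sectors, uint32_t blk_len)
{
	assert(blk_len != 0);
	return ((uint64_t)max_sectors * MAX_SECTOR_UNIT) / blk_len;
}
```

For example, a host limit of 1024 sectors on a device with 4096-byte
blocks gives a transfer limit of 128 blocks.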

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 6 ++++--
 drivers/scsi/cxlflash/superpipe.h | 3 +++
 drivers/scsi/cxlflash/vlun.c      | 3 ++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index f1b62ce..7df985d 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -315,7 +315,8 @@ retry:
 		retry_cnt ? "re" : "", scsi_cmd[0]);
 
 	result = scsi_execute(sdev, scsi_cmd, DMA_FROM_DEVICE, cmd_buf,
-			      CMD_BUFSIZE, sense_buf, tout, 5, 0, NULL);
+			      CMD_BUFSIZE, sense_buf, tout, CMD_RETRIES,
+			      0, NULL);
 
 	if (driver_byte(result) == DRIVER_SENSE) {
 		result &= ~(0xFF<<24); /* DRIVER_SENSE is not an error */
@@ -1375,7 +1376,8 @@ out_attach:
 	attach->block_size = gli->blk_len;
 	attach->mmio_size = sizeof(afu->afu_map->hosts[0].harea);
 	attach->last_lba = gli->max_lba;
-	attach->max_xfer = (sdev->host->max_sectors * 512) / gli->blk_len;
+	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT) /
+		gli->blk_len;
 
 out:
 	attach->adap_fd = fd;
diff --git a/drivers/scsi/cxlflash/superpipe.h b/drivers/scsi/cxlflash/superpipe.h
index d7dc88b..3f7856b 100644
--- a/drivers/scsi/cxlflash/superpipe.h
+++ b/drivers/scsi/cxlflash/superpipe.h
@@ -29,6 +29,9 @@ extern struct cxlflash_global global;
 #define MC_CHUNK_SIZE     (1 << MC_RHT_NMASK)	/* in LBAs */
 
 #define MC_DISCOVERY_TIMEOUT 5  /* 5 secs */
+#define CMD_RETRIES 5   /* 5 retries for scsi_execute */
+
+#define MAX_SECTOR_UNIT  512 /* max_sector is in 512 byte multiples */
 
 #define CHAN2PORT(_x)	((_x) + 1)
 #define PORT2CHAN(_x)	((_x) - 1)
diff --git a/drivers/scsi/cxlflash/vlun.c b/drivers/scsi/cxlflash/vlun.c
index 6155cb1..6d6608b 100644
--- a/drivers/scsi/cxlflash/vlun.c
+++ b/drivers/scsi/cxlflash/vlun.c
@@ -434,7 +434,8 @@ static int write_same16(struct scsi_device *sdev,
 				   &scsi_cmd[10]);
 
 		result = scsi_execute(sdev, scsi_cmd, DMA_TO_DEVICE, cmd_buf,
-				      CMD_BUFSIZE, sense_buf, tout, 5, 0, NULL);
+				      CMD_BUFSIZE, sense_buf, tout, CMD_RETRIES,
+				      0, NULL);
 		if (result) {
 			dev_err_ratelimited(dev, "%s: command failed for "
 					    "offset %lld result=0x%x\n",
-- 
2.1.0


* [PATCH v2 03/30] cxlflash: Fix read capacity timeout
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
  2015-09-16 21:25 ` [PATCH v2 01/30] cxlflash: Fix to avoid invalid port_sel value Matthew R. Ochs
  2015-09-16 21:26 ` [PATCH v2 02/30] cxlflash: Replace magic numbers with literals Matthew R. Ochs
@ 2015-09-16 21:26 ` Matthew R. Ochs
  2015-09-18  1:21   ` Brian King
  2015-09-21 11:36   ` Tomas Henzl
  2015-09-16 21:27 ` [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal Matthew R. Ochs
                   ` (26 subsequent siblings)
  29 siblings, 2 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:26 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj Kumar, Manoj N. Kumar

From: Manoj Kumar <kumarmn@us.ibm.com>

The timeout value for read capacity is too small. Certain devices
may take longer to respond and thus the command may time out
prematurely. Additionally, the literal used for the timeout is stale.

Update the timeout to 30 seconds (matches the value used in sd.c)
and rework the timeout literal to a more appropriate description.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 9 ++++-----
 drivers/scsi/cxlflash/superpipe.h | 2 +-
 drivers/scsi/cxlflash/vlun.c      | 4 ++--
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index 7df985d..fa513ba 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -296,7 +296,7 @@ static int read_cap16(struct scsi_device *sdev, struct llun_info *lli)
 	int rc = 0;
 	int result = 0;
 	int retry_cnt = 0;
-	u32 tout = (MC_DISCOVERY_TIMEOUT * HZ);
+	u32 to = (CMD_TIMEOUT * HZ);
 
 retry:
 	cmd_buf = kzalloc(CMD_BUFSIZE, GFP_KERNEL);
@@ -315,8 +315,7 @@ retry:
 		retry_cnt ? "re" : "", scsi_cmd[0]);
 
 	result = scsi_execute(sdev, scsi_cmd, DMA_FROM_DEVICE, cmd_buf,
-			      CMD_BUFSIZE, sense_buf, tout, CMD_RETRIES,
-			      0, NULL);
+			      CMD_BUFSIZE, sense_buf, to, CMD_RETRIES, 0, NULL);
 
 	if (driver_byte(result) == DRIVER_SENSE) {
 		result &= ~(0xFF<<24); /* DRIVER_SENSE is not an error */
@@ -1376,8 +1375,8 @@ out_attach:
 	attach->block_size = gli->blk_len;
 	attach->mmio_size = sizeof(afu->afu_map->hosts[0].harea);
 	attach->last_lba = gli->max_lba;
-	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT) /
-		gli->blk_len;
+	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT);
+	attach->max_xfer /= gli->blk_len;
 
 out:
 	attach->adap_fd = fd;
diff --git a/drivers/scsi/cxlflash/superpipe.h b/drivers/scsi/cxlflash/superpipe.h
index 3f7856b..fffb179 100644
--- a/drivers/scsi/cxlflash/superpipe.h
+++ b/drivers/scsi/cxlflash/superpipe.h
@@ -28,7 +28,7 @@ extern struct cxlflash_global global;
 */
 #define MC_CHUNK_SIZE     (1 << MC_RHT_NMASK)	/* in LBAs */
 
-#define MC_DISCOVERY_TIMEOUT 5  /* 5 secs */
+#define CMD_TIMEOUT 30  /* 30 secs */
 #define CMD_RETRIES 5   /* 5 retries for scsi_execute */
 
 #define MAX_SECTOR_UNIT  512 /* max_sector is in 512 byte multiples */
diff --git a/drivers/scsi/cxlflash/vlun.c b/drivers/scsi/cxlflash/vlun.c
index 6d6608b..68994c4 100644
--- a/drivers/scsi/cxlflash/vlun.c
+++ b/drivers/scsi/cxlflash/vlun.c
@@ -414,7 +414,7 @@ static int write_same16(struct scsi_device *sdev,
 	int ws_limit = SISLITE_MAX_WS_BLOCKS;
 	u64 offset = lba;
 	int left = nblks;
-	u32 tout = sdev->request_queue->rq_timeout;
+	u32 to = sdev->request_queue->rq_timeout;
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)sdev->host->hostdata;
 	struct device *dev = &cfg->dev->dev;
 
@@ -434,7 +434,7 @@ static int write_same16(struct scsi_device *sdev,
 				   &scsi_cmd[10]);
 
 		result = scsi_execute(sdev, scsi_cmd, DMA_TO_DEVICE, cmd_buf,
-				      CMD_BUFSIZE, sense_buf, tout, CMD_RETRIES,
+				      CMD_BUFSIZE, sense_buf, to, CMD_RETRIES,
 				      0, NULL);
 		if (result) {
 			dev_err_ratelimited(dev, "%s: command failed for "
-- 
2.1.0


* [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (2 preceding siblings ...)
  2015-09-16 21:26 ` [PATCH v2 03/30] cxlflash: Fix read capacity timeout Matthew R. Ochs
@ 2015-09-16 21:27 ` Matthew R. Ochs
  2015-09-18  1:26   ` Brian King
  2015-09-21 12:11   ` Tomas Henzl
  2015-09-16 21:27 ` [PATCH v2 05/30] cxlflash: Fix data corruption when vLUN used over multiple cards Matthew R. Ochs
                   ` (25 subsequent siblings)
  29 siblings, 2 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:27 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

When a LUN is removed, the sdev that is associated with the LUN
remains intact until its reference count drops to 0. In order
to prevent an sdev from being removed while a context is still
associated with it, obtain an additional reference per-context
for each LUN attached to the context.

This resolves a potential oops in the release handler when
dealing with a LUN that has already been removed.
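
The reference pairing can be sketched in userspace C as below. This is
a simplified model, not the driver's code: struct dev, dev_get(),
dev_put(), attach() and detach() are hypothetical stand-ins for the
sdev reference calls and the attach/detach paths. The point is that
each error label releases exactly one resource, so every path drops
the reference exactly once.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted device. */
struct dev {
	int refcnt;
	int removed;
};

static int dev_get(struct dev *d)
{
	if (d->removed)
		return -1;
	d->refcnt++;
	return 0;
}

static void dev_put(struct dev *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

struct attachment {
	struct dev *d;
};

/* Take one reference per attachment; unwind in reverse order on error. */
static struct attachment *attach(struct dev *d, int fail_init)
{
	struct attachment *a;

	if (dev_get(d))
		return NULL;

	a = calloc(1, sizeof(*a));
	if (!a)
		goto err0;

	if (fail_init)		/* simulate a later setup step failing */
		goto err1;

	a->d = d;
	return a;

err1:
	free(a);
err0:
	dev_put(d);
	return NULL;
}

/* Drop the reference that bound the device to this attachment. */
static void detach(struct attachment *a)
{
	dev_put(a->d);
	free(a);
}
```

With this shape, a failed attach leaves the count where it started and
a successful attach is balanced by exactly one put at detach time.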

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index fa513ba..1fa4af6 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -880,6 +880,9 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev,
 			sys_close(lfd);
 	}
 
+	/* Release the sdev reference that bound this LUN to the context */
+	scsi_device_put(sdev);
+
 out:
 	if (put_ctx)
 		put_context(ctxi);
@@ -1287,11 +1290,18 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
 			}
 	}
 
+	rc = scsi_device_get(sdev);
+	if (unlikely(rc)) {
+		dev_err(dev, "%s: Unable to get sdev reference!\n", __func__);
+		goto out;
+	}
+
 	lun_access = kzalloc(sizeof(*lun_access), GFP_KERNEL);
 	if (unlikely(!lun_access)) {
 		dev_err(dev, "%s: Unable to allocate lun_access!\n", __func__);
+		scsi_device_put(sdev);
 		rc = -ENOMEM;
-		goto out;
+		goto err0;
 	}
 
 	lun_access->lli = lli;
@@ -1311,21 +1321,21 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
 		dev_err(dev, "%s: Could not initialize context %p\n",
 			__func__, ctx);
 		rc = -ENODEV;
-		goto err0;
+		goto err1;
 	}
 
 	ctxid = cxl_process_element(ctx);
 	if (unlikely((ctxid > MAX_CONTEXT) || (ctxid < 0))) {
 		dev_err(dev, "%s: ctxid (%d) invalid!\n", __func__, ctxid);
 		rc = -EPERM;
-		goto err1;
+		goto err2;
 	}
 
 	file = cxl_get_fd(ctx, &cfg->cxl_fops, &fd);
 	if (unlikely(fd < 0)) {
 		rc = -ENODEV;
 		dev_err(dev, "%s: Could not get file descriptor\n", __func__);
-		goto err1;
+		goto err2;
 	}
 
 	/* Translate read/write O_* flags from fcntl.h to AFU permission bits */
@@ -1335,7 +1345,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
 	if (unlikely(!ctxi)) {
 		dev_err(dev, "%s: Failed to create context! (%d)\n",
 			__func__, ctxid);
-		goto err2;
+		goto err3;
 	}
 
 	work = &ctxi->work;
@@ -1346,13 +1356,13 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
 	if (unlikely(rc)) {
 		dev_dbg(dev, "%s: Could not start context rc=%d\n",
 			__func__, rc);
-		goto err3;
+		goto err4;
 	}
 
 	rc = afu_attach(cfg, ctxi);
 	if (unlikely(rc)) {
 		dev_err(dev, "%s: Could not attach AFU rc %d\n", __func__, rc);
-		goto err4;
+		goto err5;
 	}
 
 	/*
@@ -1388,13 +1398,13 @@ out:
 		__func__, ctxid, fd, attach->block_size, rc, attach->last_lba);
 	return rc;
 
-err4:
+err5:
 	cxl_stop_context(ctx);
-err3:
+err4:
 	put_context(ctxi);
 	destroy_context(cfg, ctxi);
 	ctxi = NULL;
-err2:
+err3:
 	/*
 	 * Here, we're overriding the fops with a dummy all-NULL fops because
 	 * fput() calls the release fop, which will cause us to mistakenly
@@ -1406,10 +1416,12 @@ err2:
 	fput(file);
 	put_unused_fd(fd);
 	fd = -1;
-err1:
+err2:
 	cxl_release_context(ctx);
-err0:
+err1:
 	kfree(lun_access);
+err0:
+	scsi_device_put(sdev);
 	goto out;
 }
 
-- 
2.1.0


* [PATCH v2 05/30] cxlflash: Fix data corruption when vLUN used over multiple cards
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (3 preceding siblings ...)
  2015-09-16 21:27 ` [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal Matthew R. Ochs
@ 2015-09-16 21:27 ` Matthew R. Ochs
  2015-09-18  1:28   ` Brian King
  2015-09-16 21:27 ` [PATCH v2 06/30] cxlflash: Fix to avoid sizeof(bool) Matthew R. Ochs
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:27 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

If the same virtual LUN is accessed over multiple cards, only accesses
made over the first card will be valid. Accesses made over the second
card will go to the wrong LUN causing data corruption.

This is because the global LUN's mode word was being used to determine
whether the LUN table for that card needs to be programmed. The mode
word would be set up by the first card, causing the LUN table for the
second card to not be programmed.

By unconditionally initializing the LUN table (not depending on the
mode word), the problem is avoided.
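
The distinction between shared and per-card state can be modeled with
a small userspace sketch. All names here (struct glun, struct card,
virtual_open()) are hypothetical; the sketch only shows why a shared
mode word must not gate per-card setup.

```c
#include <assert.h>
#include <stdbool.h>

enum mode { MODE_NONE, MODE_VIRTUAL };

/* Shared across all cards (models the global LUN entry). */
struct glun {
	enum mode mode;
	int allocator_inits;
};

/* Private to one card (models the per-adapter LUN table). */
struct card {
	bool lun_table_programmed;
};

static void virtual_open(struct glun *g, struct card *c)
{
	/* One-time setup of shared state, gated by the shared mode. */
	if (g->mode == MODE_NONE) {
		g->allocator_inits++;
		g->mode = MODE_VIRTUAL;
	}

	/* Per-card setup must not depend on the shared mode word. */
	c->lun_table_programmed = true;

	assert(g->mode == MODE_VIRTUAL);
}
```

Opening the same LUN through two cards initializes the shared
allocator once while still programming both per-card tables.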

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/vlun.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/cxlflash/vlun.c b/drivers/scsi/cxlflash/vlun.c
index 68994c4..96b074f 100644
--- a/drivers/scsi/cxlflash/vlun.c
+++ b/drivers/scsi/cxlflash/vlun.c
@@ -915,16 +915,9 @@ int cxlflash_disk_virtual_open(struct scsi_device *sdev, void *arg)
 
 	pr_debug("%s: ctxid=%llu ls=0x%llx\n", __func__, ctxid, lun_size);
 
+	/* Setup the LUNs block allocator on first call */
 	mutex_lock(&gli->mutex);
 	if (gli->mode == MODE_NONE) {
-		/* Setup the LUN table and block allocator on first call */
-		rc = init_luntable(cfg, lli);
-		if (rc) {
-			dev_err(dev, "%s: call to init_luntable failed "
-				"rc=%d!\n", __func__, rc);
-			goto err0;
-		}
-
 		rc = init_vlun(lli);
 		if (rc) {
 			dev_err(dev, "%s: call to init_vlun failed rc=%d!\n",
@@ -942,6 +935,13 @@ int cxlflash_disk_virtual_open(struct scsi_device *sdev, void *arg)
 	}
 	mutex_unlock(&gli->mutex);
 
+	rc = init_luntable(cfg, lli);
+	if (rc) {
+		dev_err(dev, "%s: call to init_luntable failed rc=%d!\n",
+			__func__, rc);
+		goto err1;
+	}
+
 	ctxi = get_context(cfg, rctxid, lli, 0);
 	if (unlikely(!ctxi)) {
 		dev_err(dev, "%s: Bad context! (%llu)\n", __func__, ctxid);
-- 
2.1.0


* [PATCH v2 06/30] cxlflash: Fix to avoid sizeof(bool)
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (4 preceding siblings ...)
  2015-09-16 21:27 ` [PATCH v2 05/30] cxlflash: Fix data corruption when vLUN used over multiple cards Matthew R. Ochs
@ 2015-09-16 21:27 ` Matthew R. Ochs
  2015-09-18  1:29   ` Brian King
  2015-09-16 21:27 ` [PATCH v2 07/30] cxlflash: Fix context encode mask width Matthew R. Ochs
                   ` (23 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:27 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Using sizeof(bool) is considered poor form for various reasons and
sparse warns us of that. Correct by changing type from bool to u8.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 2 +-
 drivers/scsi/cxlflash/superpipe.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index 1fa4af6..cf2a85d 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -737,7 +737,7 @@ static struct ctx_info *create_context(struct cxlflash_cfg *cfg,
 	struct afu *afu = cfg->afu;
 	struct ctx_info *ctxi = NULL;
 	struct llun_info **lli = NULL;
-	bool *ws = NULL;
+	u8 *ws = NULL;
 	struct sisl_rht_entry *rhte;
 
 	ctxi = kzalloc(sizeof(*ctxi), GFP_KERNEL);
diff --git a/drivers/scsi/cxlflash/superpipe.h b/drivers/scsi/cxlflash/superpipe.h
index fffb179..72d53cf 100644
--- a/drivers/scsi/cxlflash/superpipe.h
+++ b/drivers/scsi/cxlflash/superpipe.h
@@ -97,7 +97,7 @@ struct ctx_info {
 	u32 rht_out;		/* Number of checked out RHT entries */
 	u32 rht_perms;		/* User-defined permissions for RHT entries */
 	struct llun_info **rht_lun;       /* Mapping of RHT entries to LUNs */
-	bool *rht_needs_ws;	/* User-desired write-same function per RHTE */
+	u8 *rht_needs_ws;	/* User-desired write-same function per RHTE */
 
 	struct cxl_ioctl_start_work work;
 	u64 ctxid;
-- 
2.1.0


* [PATCH v2 07/30] cxlflash: Fix context encode mask width
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (5 preceding siblings ...)
  2015-09-16 21:27 ` [PATCH v2 06/30] cxlflash: Fix to avoid sizeof(bool) Matthew R. Ochs
@ 2015-09-16 21:27 ` Matthew R. Ochs
  2015-09-18  1:29   ` Brian King
  2015-09-16 21:27 ` [PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH Matthew R. Ochs
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:27 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

The context encode mask covers more than 32 bits, so it does not
fit in a 32-bit integer. Make the intended width explicit by
appending the ULL suffix to the mask.
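
A userspace sketch of the encode/decode pair, written as functions
rather than macros. The function names are hypothetical; the mask and
shift values are taken from the macros in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Place bits 4..35 of ctx in the upper 32 bits of the result and the
 * id in the lower 32 bits. The ULL suffix makes the 36-bit mask an
 * unsigned 64-bit constant on every platform.
 */
static uint64_t encode_ctxid(uint64_t ctx, uint32_t id)
{
	uint64_t val = ((ctx & 0xFFFFFFFF0ULL) << 28) | id;

	/* The id must survive in the low 32 bits. */
	assert((val & 0xFFFFFFFF) == id);
	return val;
}

/* Recover the id from the low 32 bits. */
static uint32_t decode_ctxid(uint64_t val)
{
	return (uint32_t)(val & 0xFFFFFFFF);
}
```

Since the masked value occupies bits 4..35, shifting it left by 28
moves it to bits 32..63 and leaves the low 32 bits free for the id.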

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/superpipe.h b/drivers/scsi/cxlflash/superpipe.h
index 72d53cf..7947091 100644
--- a/drivers/scsi/cxlflash/superpipe.h
+++ b/drivers/scsi/cxlflash/superpipe.h
@@ -87,7 +87,7 @@ enum ctx_ctrl {
 	CTX_CTRL_FILE		= (1 << 5)
 };
 
-#define ENCODE_CTXID(_ctx, _id)	(((((u64)_ctx) & 0xFFFFFFFF0) << 28) | _id)
+#define ENCODE_CTXID(_ctx, _id)	(((((u64)_ctx) & 0xFFFFFFFF0ULL) << 28) | _id)
 #define DECODE_CTXID(_val)	(_val & 0xFFFFFFFF)
 
 struct ctx_info {
-- 
2.1.0


* [PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (6 preceding siblings ...)
  2015-09-16 21:27 ` [PATCH v2 07/30] cxlflash: Fix context encode mask width Matthew R. Ochs
@ 2015-09-16 21:27 ` Matthew R. Ochs
  2015-09-18 13:37   ` Brian King
  2015-09-16 21:28 ` [PATCH v2 09/30] cxlflash: Fix to stop interrupt processing on remove Matthew R. Ochs
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:27 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

During an EEH freeze event, certain CXL services should not be
called until after the hardware reset has taken place. Doing so
can result in unnecessary failures and possibly cause other ill
effects by triggering hardware accesses. This translates to a
requirement to quiesce all threads that may potentially use CXL
runtime services during this window. In particular, multiple ioctls
make use of the CXL services when acting on contexts on behalf of
the user. Thus, it is essential to 'drain' running ioctls _before_
proceeding with handling the EEH freeze event.

Create the ability to drain ioctls by wrapping the ioctl handler
call in a read semaphore and then implementing a small routine that
obtains the write semaphore, effectively creating a wait point for
all currently executing ioctls.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h    |   2 +
 drivers/scsi/cxlflash/main.c      |  18 +++++--
 drivers/scsi/cxlflash/superpipe.c | 104 +++++++++++++++++++++++---------------
 3 files changed, 81 insertions(+), 43 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 1c56037..1abe4e0 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -16,6 +16,7 @@
 #define _CXLFLASH_COMMON_H
 
 #include <linux/list.h>
+#include <linux/rwsem.h>
 #include <linux/types.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_device.h>
@@ -110,6 +111,7 @@ struct cxlflash_cfg {
 	atomic_t recovery_threads;
 	struct mutex ctx_recovery_mutex;
 	struct mutex ctx_tbl_list_mutex;
+	struct rw_semaphore ioctl_rwsem;
 	struct ctx_info *ctx_tbl[MAX_CONTEXT];
 	struct list_head ctx_err_recovery; /* contexts w/ recovery pending */
 	struct file_operations cxl_fops;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 3e3ccf1..6e85c77 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -2311,6 +2311,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
 	cfg->lr_port = -1;
 	mutex_init(&cfg->ctx_tbl_list_mutex);
 	mutex_init(&cfg->ctx_recovery_mutex);
+	init_rwsem(&cfg->ioctl_rwsem);
 	INIT_LIST_HEAD(&cfg->ctx_err_recovery);
 	INIT_LIST_HEAD(&cfg->lluns);
 
@@ -2365,6 +2366,19 @@ out_remove:
 }
 
 /**
+ * drain_ioctls() - wait until all currently executing ioctls have completed
+ * @cfg:	Internal structure associated with the host.
+ *
+ * Obtain write access to read/write semaphore that wraps ioctl
+ * handling to 'drain' ioctls currently executing.
+ */
+static void drain_ioctls(struct cxlflash_cfg *cfg)
+{
+	down_write(&cfg->ioctl_rwsem);
+	up_write(&cfg->ioctl_rwsem);
+}
+
+/**
  * cxlflash_pci_error_detected() - called when a PCI error is detected
  * @pdev:	PCI device struct.
  * @state:	PCI channel state.
@@ -2383,16 +2397,14 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
 	switch (state) {
 	case pci_channel_io_frozen:
 		cfg->state = STATE_LIMBO;
-
-		/* Turn off legacy I/O */
 		scsi_block_requests(cfg->host);
+		drain_ioctls(cfg);
 		rc = cxlflash_mark_contexts_error(cfg);
 		if (unlikely(rc))
 			dev_err(dev, "%s: Failed to mark user contexts!(%d)\n",
 				__func__, rc);
 		term_mc(cfg, UNDO_START);
 		stop_afu(cfg);
-
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
 		cfg->state = STATE_FAILTERM;
diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index cf2a85d..8a18230 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -1214,6 +1214,48 @@ static const struct file_operations null_fops = {
 };
 
 /**
+ * check_state() - checks and responds to the current adapter state
+ * @cfg:	Internal structure associated with the host.
+ * @ioctl:	Indicates if on an ioctl thread.
+ *
+ * This routine can block and should only be used on process context.
+ * When blocking on an ioctl thread, the ioctl read semaphore should be
+ * let up to allow for draining actively running ioctls. Also note that
+ * when waking up from waiting in reset, the state is unknown and must
+ * be checked again before proceeding.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+static int check_state(struct cxlflash_cfg *cfg, bool ioctl)
+{
+	struct device *dev = &cfg->dev->dev;
+	int rc = 0;
+
+retry:
+	switch (cfg->state) {
+	case STATE_LIMBO:
+		dev_dbg(dev, "%s: Limbo state, going to wait...\n", __func__);
+		if (ioctl)
+			up_read(&cfg->ioctl_rwsem);
+		rc = wait_event_interruptible(cfg->limbo_waitq,
+					      cfg->state != STATE_LIMBO);
+		if (ioctl)
+			down_read(&cfg->ioctl_rwsem);
+		if (unlikely(rc))
+			break;
+		goto retry;
+	case STATE_FAILTERM:
+		dev_dbg(dev, "%s: Failed/Terminating!\n", __func__);
+		rc = -ENODEV;
+		break;
+	default:
+		break;
+	}
+
+	return rc;
+}
+
+/**
  * cxlflash_disk_attach() - attach a LUN to a context
  * @sdev:	SCSI device associated with LUN.
  * @attach:	Attach ioctl data structure.
@@ -1524,41 +1566,6 @@ err1:
 }
 
 /**
- * check_state() - checks and responds to the current adapter state
- * @cfg:	Internal structure associated with the host.
- *
- * This routine can block and should only be used on process context.
- * Note that when waking up from waiting in limbo, the state is unknown
- * and must be checked again before proceeding.
- *
- * Return: 0 on success, -errno on failure
- */
-static int check_state(struct cxlflash_cfg *cfg)
-{
-	struct device *dev = &cfg->dev->dev;
-	int rc = 0;
-
-retry:
-	switch (cfg->state) {
-	case STATE_LIMBO:
-		dev_dbg(dev, "%s: Limbo, going to wait...\n", __func__);
-		rc = wait_event_interruptible(cfg->limbo_waitq,
-					      cfg->state != STATE_LIMBO);
-		if (unlikely(rc))
-			break;
-		goto retry;
-	case STATE_FAILTERM:
-		dev_dbg(dev, "%s: Failed/Terminating!\n", __func__);
-		rc = -ENODEV;
-		break;
-	default:
-		break;
-	}
-
-	return rc;
-}
-
-/**
  * cxlflash_afu_recover() - initiates AFU recovery
  * @sdev:	SCSI device associated with LUN.
  * @recover:	Recover ioctl data structure.
@@ -1647,12 +1654,17 @@ retry_recover:
 	/* Test if in error state */
 	reg = readq_be(&afu->ctrl_map->mbox_r);
 	if (reg == -1) {
-		dev_dbg(dev, "%s: MMIO read fail! Wait for recovery...\n",
-			__func__);
-		mutex_unlock(&ctxi->mutex);
+		dev_dbg(dev, "%s: MMIO fail, wait for recovery.\n", __func__);
+
+		/*
+		 * Before checking the state, put back the context obtained with
+		 * get_context() as it is no longer needed and sleep for a short
+		 * period of time (see prolog notes).
+		 */
+		put_context(ctxi);
 		ctxi = NULL;
 		ssleep(1);
-		rc = check_state(cfg);
+		rc = check_state(cfg, true);
 		if (unlikely(rc))
 			goto out;
 		goto retry;
@@ -1946,7 +1958,7 @@ static int ioctl_common(struct scsi_device *sdev, int cmd)
 		goto out;
 	}
 
-	rc = check_state(cfg);
+	rc = check_state(cfg, true);
 	if (unlikely(rc) && (cfg->state == STATE_FAILTERM)) {
 		switch (cmd) {
 		case DK_CXLFLASH_VLUN_RESIZE:
@@ -1968,6 +1980,14 @@ out:
  * @cmd:	IOCTL command.
  * @arg:	Userspace ioctl data structure.
  *
+ * A read/write semaphore is used to implement a 'drain' of currently
+ * running ioctls. The read semaphore is taken at the beginning of each
+ * ioctl thread and released upon concluding execution. Additionally the
+ * semaphore should be released and then reacquired in any ioctl execution
+ * path which will wait for an event to occur that is outside the scope of
+ * the ioctl (i.e. an adapter reset). To drain the ioctls currently running,
+ * a thread simply needs to acquire the write semaphore.
+ *
  * Return: 0 on success, -errno on failure
  */
 int cxlflash_ioctl(struct scsi_device *sdev, int cmd, void __user *arg)
@@ -2002,6 +2022,9 @@ int cxlflash_ioctl(struct scsi_device *sdev, int cmd, void __user *arg)
 	{sizeof(struct dk_cxlflash_clone), (sioctl)cxlflash_disk_clone},
 	};
 
+	/* Hold read semaphore so we can drain if needed */
+	down_read(&cfg->ioctl_rwsem);
+
 	/* Restrict command set to physical support only for internal LUN */
 	if (afu->internal_lun)
 		switch (cmd) {
@@ -2083,6 +2106,7 @@ int cxlflash_ioctl(struct scsi_device *sdev, int cmd, void __user *arg)
 	/* fall through to exit */
 
 cxlflash_ioctl_exit:
+	up_read(&cfg->ioctl_rwsem);
 	if (unlikely(rc && known_ioctl))
 		dev_err(dev, "%s: ioctl %s (%08X) on dev(%d/%d/%d/%llu) "
 			"returned rc %d\n", __func__,
-- 
2.1.0

* [PATCH v2 09/30] cxlflash: Fix to stop interrupt processing on remove
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (7 preceding siblings ...)
  2015-09-16 21:27 ` [PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH Matthew R. Ochs
@ 2015-09-16 21:28 ` Matthew R. Ochs
  2015-09-17 11:58   ` David Laight
  2015-09-16 21:28 ` [PATCH v2 10/30] cxlflash: Correct naming of limbo state and waitq Matthew R. Ochs
                   ` (20 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:28 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Interrupt processing can run in parallel with a remove operation. This
can lead to a condition where the interrupt handler is operating on
memory that has already been freed.

To avoid processing an interrupt while memory may be freed underneath
it, check for removal while in the interrupt handler. Bail out when
removal is in progress.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  2 ++
 drivers/scsi/cxlflash/main.c   | 21 +++++++++++++++------
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 1abe4e0..03d2cc6 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -103,6 +103,8 @@ struct cxlflash_cfg {
 	enum cxlflash_lr_state lr_state;
 	int lr_port;
 
+	atomic_t remove_active;
+
 	struct cxl_afu *cxl_afu;
 
 	struct pci_pool *cxlflash_cmd_pool;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 6e85c77..89ee648 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -892,6 +892,7 @@ static void cxlflash_remove(struct pci_dev *pdev)
 	spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
 
 	cfg->state = STATE_FAILTERM;
+	atomic_inc(&cfg->remove_active);
 	cxlflash_stop_term_user_contexts(cfg);
 
 	switch (cfg->init_state) {
@@ -1380,16 +1381,20 @@ static void afu_err_intr_init(struct afu *afu)
 static irqreturn_t cxlflash_sync_err_irq(int irq, void *data)
 {
 	struct afu *afu = (struct afu *)data;
+	struct cxlflash_cfg *cfg = afu->parent;
 	u64 reg;
 	u64 reg_unmasked;
 
+	if (atomic_read(&cfg->remove_active))
+		goto out;
+
 	reg = readq_be(&afu->host_map->intr_status);
 	reg_unmasked = (reg & SISL_ISTATUS_UNMASK);
 
 	if (reg_unmasked == 0UL) {
 		pr_err("%s: %llX: spurious interrupt, intr_status %016llX\n",
 		       __func__, (u64)afu, reg);
-		goto cxlflash_sync_err_irq_exit;
+		goto out;
 	}
 
 	pr_err("%s: %llX: unexpected interrupt, intr_status %016llX\n",
@@ -1397,7 +1402,7 @@ static irqreturn_t cxlflash_sync_err_irq(int irq, void *data)
 
 	writeq_be(reg_unmasked, &afu->host_map->intr_clear);
 
-cxlflash_sync_err_irq_exit:
+out:
 	pr_debug("%s: returning rc=%d\n", __func__, IRQ_HANDLED);
 	return IRQ_HANDLED;
 }
@@ -1412,6 +1417,7 @@ cxlflash_sync_err_irq_exit:
 static irqreturn_t cxlflash_rrq_irq(int irq, void *data)
 {
 	struct afu *afu = (struct afu *)data;
+	struct cxlflash_cfg *cfg = afu->parent;
 	struct afu_cmd *cmd;
 	bool toggle = afu->toggle;
 	u64 entry,
@@ -1421,8 +1427,10 @@ static irqreturn_t cxlflash_rrq_irq(int irq, void *data)
 
 	/* Process however many RRQ entries that are ready */
 	while (true) {
-		entry = *hrrq_curr;
+		if (atomic_read(&cfg->remove_active))
+			goto out;
 
+		entry = *hrrq_curr;
 		if ((entry & SISL_RESP_HANDLE_T_BIT) != toggle)
 			break;
 
@@ -1440,7 +1448,7 @@ static irqreturn_t cxlflash_rrq_irq(int irq, void *data)
 
 	afu->hrrq_curr = hrrq_curr;
 	afu->toggle = toggle;
-
+out:
 	return IRQ_HANDLED;
 }
 
@@ -1454,7 +1462,7 @@ static irqreturn_t cxlflash_rrq_irq(int irq, void *data)
 static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 {
 	struct afu *afu = (struct afu *)data;
-	struct cxlflash_cfg *cfg;
+	struct cxlflash_cfg *cfg = afu->parent;
 	u64 reg_unmasked;
 	const struct asyc_intr_info *info;
 	struct sisl_global_map *global = &afu->afu_map->global;
@@ -1462,7 +1470,8 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 	u8 port;
 	int i;
 
-	cfg = afu->parent;
+	if (atomic_read(&cfg->remove_active))
+		goto out;
 
 	reg = readq_be(&global->regs.aintr_status);
 	reg_unmasked = (reg & SISL_ASTATUS_UNMASK);
-- 
2.1.0

* [PATCH v2 10/30] cxlflash: Correct naming of limbo state and waitq
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (8 preceding siblings ...)
  2015-09-16 21:28 ` [PATCH v2 09/30] cxlflash: Fix to stop interrupt processing on remove Matthew R. Ochs
@ 2015-09-16 21:28 ` Matthew R. Ochs
  2015-09-18 15:28   ` Brian King
  2015-09-16 21:28 ` [PATCH v2 11/30] cxlflash: Make functions static Matthew R. Ochs
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:28 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Limbo is not an accurate representation of this state and is
also not consistent with the terminology that other drivers
use to represent this concept. Rename the state and its
associated waitq to 'reset'.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h    |  4 ++--
 drivers/scsi/cxlflash/main.c      | 26 +++++++++++++-------------
 drivers/scsi/cxlflash/superpipe.c | 14 +++++++-------
 3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 03d2cc6..6e0be53 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -79,7 +79,7 @@ enum cxlflash_init_state {
 
 enum cxlflash_state {
 	STATE_NORMAL,	/* Normal running state, everything good */
-	STATE_LIMBO,	/* Limbo running state, trying to reset/recover */
+	STATE_RESET,	/* Reset state, trying to reset/recover */
 	STATE_FAILTERM	/* Failed/terminating state, error out users/threads */
 };
 
@@ -127,7 +127,7 @@ struct cxlflash_cfg {
 
 	wait_queue_head_t tmf_waitq;
 	bool tmf_active;
-	wait_queue_head_t limbo_waitq;
+	wait_queue_head_t reset_waitq;
 	enum cxlflash_state state;
 };
 
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 89ee648..01b7f3e 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -382,8 +382,8 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
 
 	switch (cfg->state) {
-	case STATE_LIMBO:
-		dev_dbg_ratelimited(&cfg->dev->dev, "%s: device in limbo!\n",
+	case STATE_RESET:
+		dev_dbg_ratelimited(&cfg->dev->dev, "%s: device is in reset!\n",
 				    __func__);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
@@ -479,8 +479,8 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
 		if (unlikely(rcr))
 			rc = FAILED;
 		break;
-	case STATE_LIMBO:
-		wait_event(cfg->limbo_waitq, cfg->state != STATE_LIMBO);
+	case STATE_RESET:
+		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
 		if (cfg->state == STATE_NORMAL)
 			break;
 		/* fall through */
@@ -519,7 +519,7 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 
 	switch (cfg->state) {
 	case STATE_NORMAL:
-		cfg->state = STATE_LIMBO;
+		cfg->state = STATE_RESET;
 		scsi_block_requests(cfg->host);
 		cxlflash_mark_contexts_error(cfg);
 		rcr = cxlflash_afu_reset(cfg);
@@ -528,11 +528,11 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 			cfg->state = STATE_FAILTERM;
 		} else
 			cfg->state = STATE_NORMAL;
-		wake_up_all(&cfg->limbo_waitq);
+		wake_up_all(&cfg->reset_waitq);
 		scsi_unblock_requests(cfg->host);
 		break;
-	case STATE_LIMBO:
-		wait_event(cfg->limbo_waitq, cfg->state != STATE_LIMBO);
+	case STATE_RESET:
+		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
 		if (cfg->state == STATE_NORMAL)
 			break;
 		/* fall through */
@@ -705,7 +705,7 @@ static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
 	struct pci_dev *pdev = cfg->dev;
 
 	if (pci_channel_offline(pdev))
-		wait_event_timeout(cfg->limbo_waitq,
+		wait_event_timeout(cfg->reset_waitq,
 				   !pci_channel_offline(pdev),
 				   CXLFLASH_PCI_ERROR_RECOVERY_TIMEOUT);
 }
@@ -2313,7 +2313,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
 	cfg->mcctx = NULL;
 
 	init_waitqueue_head(&cfg->tmf_waitq);
-	init_waitqueue_head(&cfg->limbo_waitq);
+	init_waitqueue_head(&cfg->reset_waitq);
 
 	INIT_WORK(&cfg->work_q, cxlflash_worker_thread);
 	cfg->lr_state = LINK_RESET_INVALID;
@@ -2405,7 +2405,7 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
 
 	switch (state) {
 	case pci_channel_io_frozen:
-		cfg->state = STATE_LIMBO;
+		cfg->state = STATE_RESET;
 		scsi_block_requests(cfg->host);
 		drain_ioctls(cfg);
 		rc = cxlflash_mark_contexts_error(cfg);
@@ -2417,7 +2417,7 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
 		cfg->state = STATE_FAILTERM;
-		wake_up_all(&cfg->limbo_waitq);
+		wake_up_all(&cfg->reset_waitq);
 		scsi_unblock_requests(cfg->host);
 		return PCI_ERS_RESULT_DISCONNECT;
 	default:
@@ -2464,7 +2464,7 @@ static void cxlflash_pci_resume(struct pci_dev *pdev)
 	dev_dbg(dev, "%s: pdev=%p\n", __func__, pdev);
 
 	cfg->state = STATE_NORMAL;
-	wake_up_all(&cfg->limbo_waitq);
+	wake_up_all(&cfg->reset_waitq);
 	scsi_unblock_requests(cfg->host);
 }
 
diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index 8a18230..5d51c65 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -100,7 +100,7 @@ void cxlflash_stop_term_user_contexts(struct cxlflash_cfg *cfg)
 
 		dev_dbg(dev, "%s: Wait for user contexts to quiesce...\n",
 			__func__);
-		wake_up_all(&cfg->limbo_waitq);
+		wake_up_all(&cfg->reset_waitq);
 		ssleep(1);
 	}
 }
@@ -1233,12 +1233,12 @@ static int check_state(struct cxlflash_cfg *cfg, bool ioctl)
 
 retry:
 	switch (cfg->state) {
-	case STATE_LIMBO:
-		dev_dbg(dev, "%s: Limbo state, going to wait...\n", __func__);
+	case STATE_RESET:
+		dev_dbg(dev, "%s: Reset state, going to wait...\n", __func__);
 		if (ioctl)
 			up_read(&cfg->ioctl_rwsem);
-		rc = wait_event_interruptible(cfg->limbo_waitq,
-					      cfg->state != STATE_LIMBO);
+		rc = wait_event_interruptible(cfg->reset_waitq,
+					      cfg->state != STATE_RESET);
 		if (ioctl)
 			down_read(&cfg->ioctl_rwsem);
 		if (unlikely(rc))
@@ -1581,10 +1581,10 @@ err1:
  * quite possible for this routine to act as the kernel's EEH detection
  * source (MMIO read of mbox_r). Because of this, there is a window of
  * time where an EEH might have been detected but not yet 'serviced'
- * (callback invoked, causing the device to enter limbo state). To avoid
+ * (callback invoked, causing the device to enter reset state). To avoid
  * looping in this routine during that window, a 1 second sleep is in place
  * between the time the MMIO failure is detected and the time a wait on the
- * limbo wait queue is attempted via check_state().
+ * reset wait queue is attempted via check_state().
  *
  * Return: 0 on success, -errno on failure
  */
-- 
2.1.0

* [PATCH v2 11/30] cxlflash: Make functions static
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (9 preceding siblings ...)
  2015-09-16 21:28 ` [PATCH v2 10/30] cxlflash: Correct naming of limbo state and waitq Matthew R. Ochs
@ 2015-09-16 21:28 ` Matthew R. Ochs
  2015-09-18 15:34   ` Brian King
  2015-09-21 12:18   ` Tomas Henzl
  2015-09-16 21:29 ` [PATCH v2 12/30] cxlflash: Refine host/device attributes Matthew R. Ochs
                   ` (18 subsequent siblings)
  29 siblings, 2 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:28 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Code inspection found that the following functions are not used
outside of the file where they are defined. Make them static.

int cxlflash_send_cmd(struct afu *, struct afu_cmd *);
void cxlflash_wait_resp(struct afu *, struct afu_cmd *);
int cxlflash_afu_reset(struct cxlflash_cfg *);
struct afu_cmd *cxlflash_cmd_checkout(struct afu *);
void cxlflash_cmd_checkin(struct afu_cmd *);
void init_pcr(struct cxlflash_cfg *);
int init_global(struct cxlflash_cfg *);

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |    5 -
 drivers/scsi/cxlflash/main.c   | 1018 ++++++++++++++++++++--------------------
 2 files changed, 509 insertions(+), 514 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 6e0be53..2855b09 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -194,11 +194,6 @@ static inline u64 lun_to_lunid(u64 lun)
 	return swab64(lun_id);
 }
 
-int cxlflash_send_cmd(struct afu *, struct afu_cmd *);
-void cxlflash_wait_resp(struct afu *, struct afu_cmd *);
-int cxlflash_afu_reset(struct cxlflash_cfg *);
-struct afu_cmd *cxlflash_cmd_checkout(struct afu *);
-void cxlflash_cmd_checkin(struct afu_cmd *);
 int cxlflash_afu_sync(struct afu *, ctx_hndl_t, res_hndl_t, u8);
 void cxlflash_list_init(void);
 void cxlflash_term_global_luns(void);
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 01b7f3e..f2f41a7 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -36,7 +36,7 @@ MODULE_LICENSE("GPL");
 
 
 /**
- * cxlflash_cmd_checkout() - checks out an AFU command
+ * cmd_checkout() - checks out an AFU command
  * @afu:	AFU to checkout from.
  *
  * Commands are checked out in a round-robin fashion. Note that since
@@ -47,7 +47,7 @@ MODULE_LICENSE("GPL");
  *
  * Return: The checked out command or NULL when command pool is empty.
  */
-struct afu_cmd *cxlflash_cmd_checkout(struct afu *afu)
+static struct afu_cmd *cmd_checkout(struct afu *afu)
 {
 	int k, dec = CXLFLASH_NUM_CMDS;
 	struct afu_cmd *cmd;
@@ -70,7 +70,7 @@ struct afu_cmd *cxlflash_cmd_checkout(struct afu *afu)
 }
 
 /**
- * cxlflash_cmd_checkin() - checks in an AFU command
+ * cmd_checkin() - checks in an AFU command
  * @cmd:	AFU command to checkin.
  *
  * Safe to pass commands that have already been checked in. Several
@@ -79,7 +79,7 @@ struct afu_cmd *cxlflash_cmd_checkout(struct afu *afu)
  * to avoid clobbering values in the event that the command is checked
  * out right away.
  */
-void cxlflash_cmd_checkin(struct afu_cmd *cmd)
+static void cmd_checkin(struct afu_cmd *cmd)
 {
 	cmd->rcb.scp = NULL;
 	cmd->rcb.timeout = 0;
@@ -238,7 +238,7 @@ static void cmd_complete(struct afu_cmd *cmd)
 
 		resid = cmd->sa.resid;
 		cmd_is_tmf = cmd->cmd_tmf;
-		cxlflash_cmd_checkin(cmd); /* Don't use cmd after here */
+		cmd_checkin(cmd); /* Don't use cmd after here */
 
 		pr_debug("%s: calling scsi_set_resid, scp=%p "
 			 "result=%X resid=%d\n", __func__,
@@ -260,6 +260,146 @@ static void cmd_complete(struct afu_cmd *cmd)
 }
 
 /**
+ * context_reset() - timeout handler for AFU commands
+ * @cmd:	AFU command that timed out.
+ *
+ * Sends a reset to the AFU.
+ */
+static void context_reset(struct afu_cmd *cmd)
+{
+	int nretry = 0;
+	u64 rrin = 0x1;
+	u64 room = 0;
+	struct afu *afu = cmd->parent;
+	ulong lock_flags;
+
+	pr_debug("%s: cmd=%p\n", __func__, cmd);
+
+	spin_lock_irqsave(&cmd->slock, lock_flags);
+
+	/* Already completed? */
+	if (cmd->sa.host_use_b[0] & B_DONE) {
+		spin_unlock_irqrestore(&cmd->slock, lock_flags);
+		return;
+	}
+
+	cmd->sa.host_use_b[0] |= (B_DONE | B_ERROR | B_TIMEOUT);
+	spin_unlock_irqrestore(&cmd->slock, lock_flags);
+
+	/*
+	 * We really want to send this reset at all costs, so spread
+	 * out wait time on successive retries for available room.
+	 */
+	do {
+		room = readq_be(&afu->host_map->cmd_room);
+		atomic64_set(&afu->room, room);
+		if (room)
+			goto write_rrin;
+		udelay(nretry);
+	} while (nretry++ < MC_ROOM_RETRY_CNT);
+
+	pr_err("%s: no cmd_room to send reset\n", __func__);
+	return;
+
+write_rrin:
+	nretry = 0;
+	writeq_be(rrin, &afu->host_map->ioarrin);
+	do {
+		rrin = readq_be(&afu->host_map->ioarrin);
+		if (rrin != 0x1)
+			break;
+		/* Double delay each time */
+		udelay(1 << nretry);
+	} while (nretry++ < MC_ROOM_RETRY_CNT);
+}
+
+/**
+ * send_cmd() - sends an AFU command
+ * @afu:	AFU associated with the host.
+ * @cmd:	AFU command to send.
+ *
+ * Return:
+ *	0 on success or SCSI_MLQUEUE_HOST_BUSY
+ */
+static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
+{
+	struct cxlflash_cfg *cfg = afu->parent;
+	struct device *dev = &cfg->dev->dev;
+	int nretry = 0;
+	int rc = 0;
+	u64 room;
+	long newval;
+
+	/*
+	 * This routine is used by critical users such as an AFU sync and to
+	 * send a task management function (TMF). Thus we want to retry a
+	 * bit before returning an error. To avoid the performance penalty
+	 * of MMIO, we spread the update of 'room' over multiple commands.
+	 */
+retry:
+	newval = atomic64_dec_if_positive(&afu->room);
+	if (!newval) {
+		do {
+			room = readq_be(&afu->host_map->cmd_room);
+			atomic64_set(&afu->room, room);
+			if (room)
+				goto write_ioarrin;
+			udelay(nretry);
+		} while (nretry++ < MC_ROOM_RETRY_CNT);
+
+		dev_err(dev, "%s: no cmd_room to send 0x%X\n",
+		       __func__, cmd->rcb.cdb[0]);
+
+		goto no_room;
+	} else if (unlikely(newval < 0)) {
+		/* This should be rare. i.e. Only if two threads race and
+		 * decrement before the MMIO read is done. In this case
+		 * just benefit from the other thread having updated
+		 * afu->room.
+		 */
+		if (nretry++ < MC_ROOM_RETRY_CNT) {
+			udelay(nretry);
+			goto retry;
+		}
+
+		goto no_room;
+	}
+
+write_ioarrin:
+	writeq_be((u64)&cmd->rcb, &afu->host_map->ioarrin);
+out:
+	pr_devel("%s: cmd=%p len=%d ea=%p rc=%d\n", __func__, cmd,
+		 cmd->rcb.data_len, (void *)cmd->rcb.data_ea, rc);
+	return rc;
+
+no_room:
+	afu->read_room = true;
+	schedule_work(&cfg->work_q);
+	rc = SCSI_MLQUEUE_HOST_BUSY;
+	goto out;
+}
+
+/**
+ * wait_resp() - polls for a response or timeout to a sent AFU command
+ * @afu:	AFU associated with the host.
+ * @cmd:	AFU command that was sent.
+ */
+static void wait_resp(struct afu *afu, struct afu_cmd *cmd)
+{
+	ulong timeout = msecs_to_jiffies(cmd->rcb.timeout * 2 * 1000);
+
+	timeout = wait_for_completion_timeout(&cmd->cevent, timeout);
+	if (!timeout)
+		context_reset(cmd);
+
+	if (unlikely(cmd->sa.ioasc != 0))
+		pr_err("%s: CMD 0x%X failed, IOASC: flags 0x%X, afu_rc 0x%X, "
+		       "scsi_rc 0x%X, fc_rc 0x%X\n", __func__, cmd->rcb.cdb[0],
+		       cmd->sa.rc.flags, cmd->sa.rc.afu_rc, cmd->sa.rc.scsi_rc,
+		       cmd->sa.rc.fc_rc);
+}
+
+/**
  * send_tmf() - sends a Task Management Function (TMF)
  * @afu:	AFU to checkout from.
  * @scp:	SCSI command from stack.
@@ -280,7 +420,7 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	ulong lock_flags;
 	int rc = 0;
 
-	cmd = cxlflash_cmd_checkout(afu);
+	cmd = cmd_checkout(afu);
 	if (unlikely(!cmd)) {
 		pr_err("%s: could not get a free command\n", __func__);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
@@ -313,9 +453,9 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	memcpy(cmd->rcb.cdb, &tmfcmd, sizeof(tmfcmd));
 
 	/* Send the command */
-	rc = cxlflash_send_cmd(afu, cmd);
+	rc = send_cmd(afu, cmd);
 	if (unlikely(rc)) {
-		cxlflash_cmd_checkin(cmd);
+		cmd_checkin(cmd);
 		spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
 		cfg->tmf_active = false;
 		spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
@@ -398,7 +538,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 		break;
 	}
 
-	cmd = cxlflash_cmd_checkout(afu);
+	cmd = cmd_checkout(afu);
 	if (unlikely(!cmd)) {
 		pr_err("%s: could not get a free command\n", __func__);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
@@ -438,9 +578,9 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	memcpy(cmd->rcb.cdb, scp->cmnd, sizeof(cmd->rcb.cdb));
 
 	/* Send the command */
-	rc = cxlflash_send_cmd(afu, cmd);
+	rc = send_cmd(afu, cmd);
 	if (unlikely(rc)) {
-		cxlflash_cmd_checkin(cmd);
+		cmd_checkin(cmd);
 		scsi_dma_unmap(scp);
 	}
 
@@ -449,369 +589,55 @@ out:
 }
 
 /**
- * cxlflash_eh_device_reset_handler() - reset a single LUN
- * @scp:	SCSI command to send.
- *
- * Return:
- *	SUCCESS as defined in scsi/scsi.h
- *	FAILED as defined in scsi/scsi.h
+ * cxlflash_wait_for_pci_err_recovery() - wait for error recovery during probe
+ * @cxlflash:	Internal structure associated with the host.
  */
-static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
+static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
 {
-	int rc = SUCCESS;
-	struct Scsi_Host *host = scp->device->host;
-	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
-	struct afu *afu = cfg->afu;
-	int rcr = 0;
-
-	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
-		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
-		 host->host_no, scp->device->channel,
-		 scp->device->id, scp->device->lun,
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
-
-	switch (cfg->state) {
-	case STATE_NORMAL:
-		rcr = send_tmf(afu, scp, TMF_LUN_RESET);
-		if (unlikely(rcr))
-			rc = FAILED;
-		break;
-	case STATE_RESET:
-		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
-		if (cfg->state == STATE_NORMAL)
-			break;
-		/* fall through */
-	default:
-		rc = FAILED;
-		break;
-	}
+	struct pci_dev *pdev = cfg->dev;
 
-	pr_debug("%s: returning rc=%d\n", __func__, rc);
-	return rc;
+	if (pci_channel_offline(pdev))
+		wait_event_timeout(cfg->reset_waitq,
+				   !pci_channel_offline(pdev),
+				   CXLFLASH_PCI_ERROR_RECOVERY_TIMEOUT);
 }
 
 /**
- * cxlflash_eh_host_reset_handler() - reset the host adapter
- * @scp:	SCSI command from stack identifying host.
- *
- * Return:
- *	SUCCESS as defined in scsi/scsi.h
- *	FAILED as defined in scsi/scsi.h
+ * free_mem() - free memory associated with the AFU
+ * @cxlflash:	Internal structure associated with the host.
  */
-static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
+static void free_mem(struct cxlflash_cfg *cfg)
 {
-	int rc = SUCCESS;
-	int rcr = 0;
-	struct Scsi_Host *host = scp->device->host;
-	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
+	int i;
+	char *buf = NULL;
+	struct afu *afu = cfg->afu;
 
-	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
-		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
-		 host->host_no, scp->device->channel,
-		 scp->device->id, scp->device->lun,
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
+	if (cfg->afu) {
+		for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
+			buf = afu->cmd[i].buf;
+			if (!((u64)buf & (PAGE_SIZE - 1)))
+				free_page((ulong)buf);
+		}
 
-	switch (cfg->state) {
-	case STATE_NORMAL:
-		cfg->state = STATE_RESET;
-		scsi_block_requests(cfg->host);
-		cxlflash_mark_contexts_error(cfg);
-		rcr = cxlflash_afu_reset(cfg);
-		if (rcr) {
-			rc = FAILED;
-			cfg->state = STATE_FAILTERM;
-		} else
-			cfg->state = STATE_NORMAL;
-		wake_up_all(&cfg->reset_waitq);
-		scsi_unblock_requests(cfg->host);
-		break;
-	case STATE_RESET:
-		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
-		if (cfg->state == STATE_NORMAL)
-			break;
-		/* fall through */
-	default:
-		rc = FAILED;
-		break;
+		free_pages((ulong)afu, get_order(sizeof(struct afu)));
+		cfg->afu = NULL;
 	}
-
-	pr_debug("%s: returning rc=%d\n", __func__, rc);
-	return rc;
 }
 
 /**
- * cxlflash_change_queue_depth() - change the queue depth for the device
- * @sdev:	SCSI device destined for queue depth change.
- * @qdepth:	Requested queue depth value to set.
- *
- * The requested queue depth is capped to the maximum supported value.
+ * stop_afu() - stops the AFU command timers and unmaps the MMIO space
+ * @cxlflash:	Internal structure associated with the host.
  *
- * Return: The actual queue depth set.
+ * Safe to call with AFU in a partially allocated/initialized state.
  */
-static int cxlflash_change_queue_depth(struct scsi_device *sdev, int qdepth)
+static void stop_afu(struct cxlflash_cfg *cfg)
 {
+	int i;
+	struct afu *afu = cfg->afu;
 
-	if (qdepth > CXLFLASH_MAX_CMDS_PER_LUN)
-		qdepth = CXLFLASH_MAX_CMDS_PER_LUN;
-
-	scsi_change_queue_depth(sdev, qdepth);
-	return sdev->queue_depth;
-}
-
-/**
- * cxlflash_show_port_status() - queries and presents the current port status
- * @dev:	Generic device associated with the host owning the port.
- * @attr:	Device attribute representing the port.
- * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
- *
- * Return: The size of the ASCII string returned in @buf.
- */
-static ssize_t cxlflash_show_port_status(struct device *dev,
-					 struct device_attribute *attr,
-					 char *buf)
-{
-	struct Scsi_Host *shost = class_to_shost(dev);
-	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
-	struct afu *afu = cfg->afu;
-
-	char *disp_status;
-	int rc;
-	u32 port;
-	u64 status;
-	u64 *fc_regs;
-
-	rc = kstrtouint((attr->attr.name + 4), 10, &port);
-	if (rc || (port >= NUM_FC_PORTS))
-		return 0;
-
-	fc_regs = &afu->afu_map->global.fc_regs[port][0];
-	status =
-	    (readq_be(&fc_regs[FC_MTIP_STATUS / 8]) & FC_MTIP_STATUS_MASK);
-
-	if (status == FC_MTIP_STATUS_ONLINE)
-		disp_status = "online";
-	else if (status == FC_MTIP_STATUS_OFFLINE)
-		disp_status = "offline";
-	else
-		disp_status = "unknown";
-
-	return snprintf(buf, PAGE_SIZE, "%s\n", disp_status);
-}
-
-/**
- * cxlflash_show_lun_mode() - presents the current LUN mode of the host
- * @dev:	Generic device associated with the host.
- * @attr:	Device attribute representing the lun mode.
- * @buf:	Buffer of length PAGE_SIZE to report back the LUN mode in ASCII.
- *
- * Return: The size of the ASCII string returned in @buf.
- */
-static ssize_t cxlflash_show_lun_mode(struct device *dev,
-				      struct device_attribute *attr, char *buf)
-{
-	struct Scsi_Host *shost = class_to_shost(dev);
-	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
-	struct afu *afu = cfg->afu;
-
-	return snprintf(buf, PAGE_SIZE, "%u\n", afu->internal_lun);
-}
-
-/**
- * cxlflash_store_lun_mode() - sets the LUN mode of the host
- * @dev:	Generic device associated with the host.
- * @attr:	Device attribute representing the lun mode.
- * @buf:	Buffer of length PAGE_SIZE containing the LUN mode in ASCII.
- * @count:	Length of data resizing in @buf.
- *
- * The CXL Flash AFU supports a dummy LUN mode where the external
- * links and storage are not required. Space on the FPGA is used
- * to create 1 or 2 small LUNs which are presented to the system
- * as if they were a normal storage device. This feature is useful
- * during development and also provides manufacturing with a way
- * to test the AFU without an actual device.
- *
- * 0 = external LUN[s] (default)
- * 1 = internal LUN (1 x 64K, 512B blocks, id 0)
- * 2 = internal LUN (1 x 64K, 4K blocks, id 0)
- * 3 = internal LUN (2 x 32K, 512B blocks, ids 0,1)
- * 4 = internal LUN (2 x 32K, 4K blocks, ids 0,1)
- *
- * Return: The size of the ASCII string returned in @buf.
- */
-static ssize_t cxlflash_store_lun_mode(struct device *dev,
-				       struct device_attribute *attr,
-				       const char *buf, size_t count)
-{
-	struct Scsi_Host *shost = class_to_shost(dev);
-	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
-	struct afu *afu = cfg->afu;
-	int rc;
-	u32 lun_mode;
-
-	rc = kstrtouint(buf, 10, &lun_mode);
-	if (!rc && (lun_mode < 5) && (lun_mode != afu->internal_lun)) {
-		afu->internal_lun = lun_mode;
-		cxlflash_afu_reset(cfg);
-		scsi_scan_host(cfg->host);
-	}
-
-	return count;
-}
-
-/**
- * cxlflash_show_ioctl_version() - presents the current ioctl version of the host
- * @dev:	Generic device associated with the host.
- * @attr:	Device attribute representing the ioctl version.
- * @buf:	Buffer of length PAGE_SIZE to report back the ioctl version.
- *
- * Return: The size of the ASCII string returned in @buf.
- */
-static ssize_t cxlflash_show_ioctl_version(struct device *dev,
-					   struct device_attribute *attr,
-					   char *buf)
-{
-	return scnprintf(buf, PAGE_SIZE, "%u\n", DK_CXLFLASH_VERSION_0);
-}
-
-/**
- * cxlflash_show_dev_mode() - presents the current mode of the device
- * @dev:	Generic device associated with the device.
- * @attr:	Device attribute representing the device mode.
- * @buf:	Buffer of length PAGE_SIZE to report back the dev mode in ASCII.
- *
- * Return: The size of the ASCII string returned in @buf.
- */
-static ssize_t cxlflash_show_dev_mode(struct device *dev,
-				      struct device_attribute *attr, char *buf)
-{
-	struct scsi_device *sdev = to_scsi_device(dev);
-
-	return snprintf(buf, PAGE_SIZE, "%s\n",
-			sdev->hostdata ? "superpipe" : "legacy");
-}
-
-/**
- * cxlflash_wait_for_pci_err_recovery() - wait for error recovery during probe
- * @cxlflash:	Internal structure associated with the host.
- */
-static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
-{
-	struct pci_dev *pdev = cfg->dev;
-
-	if (pci_channel_offline(pdev))
-		wait_event_timeout(cfg->reset_waitq,
-				   !pci_channel_offline(pdev),
-				   CXLFLASH_PCI_ERROR_RECOVERY_TIMEOUT);
-}
-
-/*
- * Host attributes
- */
-static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL);
-static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL);
-static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode,
-		   cxlflash_store_lun_mode);
-static DEVICE_ATTR(ioctl_version, S_IRUGO, cxlflash_show_ioctl_version, NULL);
-
-static struct device_attribute *cxlflash_host_attrs[] = {
-	&dev_attr_port0,
-	&dev_attr_port1,
-	&dev_attr_lun_mode,
-	&dev_attr_ioctl_version,
-	NULL
-};
-
-/*
- * Device attributes
- */
-static DEVICE_ATTR(mode, S_IRUGO, cxlflash_show_dev_mode, NULL);
-
-static struct device_attribute *cxlflash_dev_attrs[] = {
-	&dev_attr_mode,
-	NULL
-};
-
-/*
- * Host template
- */
-static struct scsi_host_template driver_template = {
-	.module = THIS_MODULE,
-	.name = CXLFLASH_ADAPTER_NAME,
-	.info = cxlflash_driver_info,
-	.ioctl = cxlflash_ioctl,
-	.proc_name = CXLFLASH_NAME,
-	.queuecommand = cxlflash_queuecommand,
-	.eh_device_reset_handler = cxlflash_eh_device_reset_handler,
-	.eh_host_reset_handler = cxlflash_eh_host_reset_handler,
-	.change_queue_depth = cxlflash_change_queue_depth,
-	.cmd_per_lun = 16,
-	.can_queue = CXLFLASH_MAX_CMDS,
-	.this_id = -1,
-	.sg_tablesize = SG_NONE,	/* No scatter gather support. */
-	.max_sectors = CXLFLASH_MAX_SECTORS,
-	.use_clustering = ENABLE_CLUSTERING,
-	.shost_attrs = cxlflash_host_attrs,
-	.sdev_attrs = cxlflash_dev_attrs,
-};
-
-/*
- * Device dependent values
- */
-static struct dev_dependent_vals dev_corsa_vals = { CXLFLASH_MAX_SECTORS };
-
-/*
- * PCI device binding table
- */
-static struct pci_device_id cxlflash_pci_table[] = {
-	{PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CORSA,
-	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, (kernel_ulong_t)&dev_corsa_vals},
-	{}
-};
-
-MODULE_DEVICE_TABLE(pci, cxlflash_pci_table);
-
-/**
- * free_mem() - free memory associated with the AFU
- * @cxlflash:	Internal structure associated with the host.
- */
-static void free_mem(struct cxlflash_cfg *cfg)
-{
-	int i;
-	char *buf = NULL;
-	struct afu *afu = cfg->afu;
-
-	if (cfg->afu) {
-		for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
-			buf = afu->cmd[i].buf;
-			if (!((u64)buf & (PAGE_SIZE - 1)))
-				free_page((ulong)buf);
-		}
-
-		free_pages((ulong)afu, get_order(sizeof(struct afu)));
-		cfg->afu = NULL;
-	}
-}
-
-/**
- * stop_afu() - stops the AFU command timers and unmaps the MMIO space
- * @cxlflash:	Internal structure associated with the host.
- *
- * Safe to call with AFU in a partially allocated/initialized state.
- */
-static void stop_afu(struct cxlflash_cfg *cfg)
-{
-	int i;
-	struct afu *afu = cfg->afu;
-
-	if (likely(afu)) {
-		for (i = 0; i < CXLFLASH_NUM_CMDS; i++)
-			complete(&afu->cmd[i].cevent);
+	if (likely(afu)) {
+		for (i = 0; i < CXLFLASH_NUM_CMDS; i++)
+			complete(&afu->cmd[i].cevent);
 
 		if (likely(afu->afu_map)) {
 			cxl_psa_unmap((void *)afu->afu_map);
@@ -1640,67 +1466,13 @@ out:
 }
 
 /**
- * cxlflash_context_reset() - timeout handler for AFU commands
- * @cmd:	AFU command that timed out.
+ * init_pcr() - initialize the provisioning and control registers
+ * @cxlflash:	Internal structure associated with the host.
  *
- * Sends a reset to the AFU.
+ * Also sets up fast access to the mapped registers and initializes AFU
+ * command fields that never change.
  */
-void cxlflash_context_reset(struct afu_cmd *cmd)
-{
-	int nretry = 0;
-	u64 rrin = 0x1;
-	u64 room = 0;
-	struct afu *afu = cmd->parent;
-	ulong lock_flags;
-
-	pr_debug("%s: cmd=%p\n", __func__, cmd);
-
-	spin_lock_irqsave(&cmd->slock, lock_flags);
-
-	/* Already completed? */
-	if (cmd->sa.host_use_b[0] & B_DONE) {
-		spin_unlock_irqrestore(&cmd->slock, lock_flags);
-		return;
-	}
-
-	cmd->sa.host_use_b[0] |= (B_DONE | B_ERROR | B_TIMEOUT);
-	spin_unlock_irqrestore(&cmd->slock, lock_flags);
-
-	/*
-	 * We really want to send this reset at all costs, so spread
-	 * out wait time on successive retries for available room.
-	 */
-	do {
-		room = readq_be(&afu->host_map->cmd_room);
-		atomic64_set(&afu->room, room);
-		if (room)
-			goto write_rrin;
-		udelay(nretry);
-	} while (nretry++ < MC_ROOM_RETRY_CNT);
-
-	pr_err("%s: no cmd_room to send reset\n", __func__);
-	return;
-
-write_rrin:
-	nretry = 0;
-	writeq_be(rrin, &afu->host_map->ioarrin);
-	do {
-		rrin = readq_be(&afu->host_map->ioarrin);
-		if (rrin != 0x1)
-			break;
-		/* Double delay each time */
-		udelay(2 ^ nretry);
-	} while (nretry++ < MC_ROOM_RETRY_CNT);
-}
-
-/**
- * init_pcr() - initialize the provisioning and control registers
- * @cxlflash:	Internal structure associated with the host.
- *
- * Also sets up fast access to the mapped registers and initializes AFU
- * command fields that never change.
- */
-void init_pcr(struct cxlflash_cfg *cfg)
+static void init_pcr(struct cxlflash_cfg *cfg)
 {
 	struct afu *afu = cfg->afu;
 	struct sisl_ctrl_map *ctrl_map;
@@ -1736,7 +1508,7 @@ void init_pcr(struct cxlflash_cfg *cfg)
  * init_global() - initialize AFU global registers
  * @cxlflash:	Internal structure associated with the host.
  */
-int init_global(struct cxlflash_cfg *cfg)
+static int init_global(struct cxlflash_cfg *cfg)
 {
 	struct afu *afu = cfg->afu;
 	u64 wwpn[NUM_FC_PORTS];	/* wwpn of AFU ports */
@@ -2007,92 +1779,6 @@ err1:
 }
 
 /**
- * cxlflash_send_cmd() - sends an AFU command
- * @afu:	AFU associated with the host.
- * @cmd:	AFU command to send.
- *
- * Return:
- *	0 on success
- *	-1 on failure
- */
-int cxlflash_send_cmd(struct afu *afu, struct afu_cmd *cmd)
-{
-	struct cxlflash_cfg *cfg = afu->parent;
-	int nretry = 0;
-	int rc = 0;
-	u64 room;
-	long newval;
-
-	/*
-	 * This routine is used by critical users such an AFU sync and to
-	 * send a task management function (TMF). Thus we want to retry a
-	 * bit before returning an error. To avoid the performance penalty
-	 * of MMIO, we spread the update of 'room' over multiple commands.
-	 */
-retry:
-	newval = atomic64_dec_if_positive(&afu->room);
-	if (!newval) {
-		do {
-			room = readq_be(&afu->host_map->cmd_room);
-			atomic64_set(&afu->room, room);
-			if (room)
-				goto write_ioarrin;
-			udelay(nretry);
-		} while (nretry++ < MC_ROOM_RETRY_CNT);
-
-		pr_err("%s: no cmd_room to send 0x%X\n",
-		       __func__, cmd->rcb.cdb[0]);
-
-		goto no_room;
-	} else if (unlikely(newval < 0)) {
-		/* This should be rare. i.e. Only if two threads race and
-		 * decrement before the MMIO read is done. In this case
-		 * just benefit from the other thread having updated
-		 * afu->room.
-		 */
-		if (nretry++ < MC_ROOM_RETRY_CNT) {
-			udelay(nretry);
-			goto retry;
-		}
-
-		goto no_room;
-	}
-
-write_ioarrin:
-	writeq_be((u64)&cmd->rcb, &afu->host_map->ioarrin);
-out:
-	pr_debug("%s: cmd=%p len=%d ea=%p rc=%d\n", __func__, cmd,
-		 cmd->rcb.data_len, (void *)cmd->rcb.data_ea, rc);
-	return rc;
-
-no_room:
-	afu->read_room = true;
-	schedule_work(&cfg->work_q);
-	rc = SCSI_MLQUEUE_HOST_BUSY;
-	goto out;
-}
-
-/**
- * cxlflash_wait_resp() - polls for a response or timeout to a sent AFU command
- * @afu:	AFU associated with the host.
- * @cmd:	AFU command that was sent.
- */
-void cxlflash_wait_resp(struct afu *afu, struct afu_cmd *cmd)
-{
-	ulong timeout = jiffies + (cmd->rcb.timeout * 2 * HZ);
-
-	timeout = wait_for_completion_timeout(&cmd->cevent, timeout);
-	if (!timeout)
-		cxlflash_context_reset(cmd);
-
-	if (unlikely(cmd->sa.ioasc != 0))
-		pr_err("%s: CMD 0x%X failed, IOASC: flags 0x%X, afu_rc 0x%X, "
-		       "scsi_rc 0x%X, fc_rc 0x%X\n", __func__, cmd->rcb.cdb[0],
-		       cmd->sa.rc.flags, cmd->sa.rc.afu_rc, cmd->sa.rc.scsi_rc,
-		       cmd->sa.rc.fc_rc);
-}
-
-/**
  * cxlflash_afu_sync() - builds and sends an AFU sync command
  * @afu:	AFU associated with the host.
  * @ctx_hndl_u:	Identifies context requesting sync.
@@ -2130,7 +1816,7 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 
 	mutex_lock(&sync_active);
 retry:
-	cmd = cxlflash_cmd_checkout(afu);
+	cmd = cmd_checkout(afu);
 	if (unlikely(!cmd)) {
 		retry_cnt++;
 		udelay(1000 * retry_cnt);
@@ -2159,11 +1845,11 @@ retry:
 	*((u16 *)&cmd->rcb.cdb[2]) = swab16(ctx_hndl_u);
 	*((u32 *)&cmd->rcb.cdb[4]) = swab32(res_hndl_u);
 
-	rc = cxlflash_send_cmd(afu, cmd);
+	rc = send_cmd(afu, cmd);
 	if (unlikely(rc))
 		goto out;
 
-	cxlflash_wait_resp(afu, cmd);
+	wait_resp(afu, cmd);
 
 	/* set on timeout */
 	if (unlikely((cmd->sa.ioasc != 0) ||
@@ -2172,20 +1858,20 @@ retry:
 out:
 	mutex_unlock(&sync_active);
 	if (cmd)
-		cxlflash_cmd_checkin(cmd);
+		cmd_checkin(cmd);
 	pr_debug("%s: returning rc=%d\n", __func__, rc);
 	return rc;
 }
 
 /**
- * cxlflash_afu_reset() - resets the AFU
- * @cxlflash:	Internal structure associated with the host.
+ * afu_reset() - resets the AFU
+ * @cfg:	Internal structure associated with the host.
  *
  * Return:
  *	0 on success
  *	A failure value from internal services.
  */
-int cxlflash_afu_reset(struct cxlflash_cfg *cfg)
+static int afu_reset(struct cxlflash_cfg *cfg)
 {
 	int rc = 0;
 	/* Stop the context before the reset. Since the context is
@@ -2201,6 +1887,320 @@ int cxlflash_afu_reset(struct cxlflash_cfg *cfg)
 }
 
 /**
+ * cxlflash_eh_device_reset_handler() - reset a single LUN
+ * @scp:	SCSI command to send.
+ *
+ * Return:
+ *	SUCCESS as defined in scsi/scsi.h
+ *	FAILED as defined in scsi/scsi.h
+ */
+static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
+{
+	int rc = SUCCESS;
+	struct Scsi_Host *host = scp->device->host;
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
+	struct afu *afu = cfg->afu;
+	int rcr = 0;
+
+	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
+		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
+		 host->host_no, scp->device->channel,
+		 scp->device->id, scp->device->lun,
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
+
+	switch (cfg->state) {
+	case STATE_NORMAL:
+		rcr = send_tmf(afu, scp, TMF_LUN_RESET);
+		if (unlikely(rcr))
+			rc = FAILED;
+		break;
+	case STATE_RESET:
+		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
+		if (cfg->state == STATE_NORMAL)
+			break;
+		/* fall through */
+	default:
+		rc = FAILED;
+		break;
+	}
+
+	pr_debug("%s: returning rc=%d\n", __func__, rc);
+	return rc;
+}
+
+/**
+ * cxlflash_eh_host_reset_handler() - reset the host adapter
+ * @scp:	SCSI command from stack identifying host.
+ *
+ * Return:
+ *	SUCCESS as defined in scsi/scsi.h
+ *	FAILED as defined in scsi/scsi.h
+ */
+static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
+{
+	int rc = SUCCESS;
+	int rcr = 0;
+	struct Scsi_Host *host = scp->device->host;
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
+
+	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
+		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
+		 host->host_no, scp->device->channel,
+		 scp->device->id, scp->device->lun,
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
+		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
+
+	switch (cfg->state) {
+	case STATE_NORMAL:
+		cfg->state = STATE_RESET;
+		scsi_block_requests(cfg->host);
+		cxlflash_mark_contexts_error(cfg);
+		rcr = afu_reset(cfg);
+		if (rcr) {
+			rc = FAILED;
+			cfg->state = STATE_FAILTERM;
+		} else
+			cfg->state = STATE_NORMAL;
+		wake_up_all(&cfg->reset_waitq);
+		scsi_unblock_requests(cfg->host);
+		break;
+	case STATE_RESET:
+		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
+		if (cfg->state == STATE_NORMAL)
+			break;
+		/* fall through */
+	default:
+		rc = FAILED;
+		break;
+	}
+
+	pr_debug("%s: returning rc=%d\n", __func__, rc);
+	return rc;
+}
+
+/**
+ * cxlflash_change_queue_depth() - change the queue depth for the device
+ * @sdev:	SCSI device destined for queue depth change.
+ * @qdepth:	Requested queue depth value to set.
+ *
+ * The requested queue depth is capped to the maximum supported value.
+ *
+ * Return: The actual queue depth set.
+ */
+static int cxlflash_change_queue_depth(struct scsi_device *sdev, int qdepth)
+{
+
+	if (qdepth > CXLFLASH_MAX_CMDS_PER_LUN)
+		qdepth = CXLFLASH_MAX_CMDS_PER_LUN;
+
+	scsi_change_queue_depth(sdev, qdepth);
+	return sdev->queue_depth;
+}
+
+/**
+ * cxlflash_show_port_status() - queries and presents the current port status
+ * @dev:	Generic device associated with the host owning the port.
+ * @attr:	Device attribute representing the port.
+ * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t cxlflash_show_port_status(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
+	struct afu *afu = cfg->afu;
+
+	char *disp_status;
+	int rc;
+	u32 port;
+	u64 status;
+	u64 *fc_regs;
+
+	rc = kstrtouint((attr->attr.name + 4), 10, &port);
+	if (rc || (port >= NUM_FC_PORTS))
+		return 0;
+
+	fc_regs = &afu->afu_map->global.fc_regs[port][0];
+	status =
+	    (readq_be(&fc_regs[FC_MTIP_STATUS / 8]) & FC_MTIP_STATUS_MASK);
+
+	if (status == FC_MTIP_STATUS_ONLINE)
+		disp_status = "online";
+	else if (status == FC_MTIP_STATUS_OFFLINE)
+		disp_status = "offline";
+	else
+		disp_status = "unknown";
+
+	return snprintf(buf, PAGE_SIZE, "%s\n", disp_status);
+}
+
+/**
+ * cxlflash_show_lun_mode() - presents the current LUN mode of the host
+ * @dev:	Generic device associated with the host.
+ * @attr:	Device attribute representing the lun mode.
+ * @buf:	Buffer of length PAGE_SIZE to report back the LUN mode in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t cxlflash_show_lun_mode(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
+	struct afu *afu = cfg->afu;
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", afu->internal_lun);
+}
+
+/**
+ * cxlflash_store_lun_mode() - sets the LUN mode of the host
+ * @dev:	Generic device associated with the host.
+ * @attr:	Device attribute representing the lun mode.
+ * @buf:	Buffer of length PAGE_SIZE containing the LUN mode in ASCII.
+ * @count:	Length of data resizing in @buf.
+ *
+ * The CXL Flash AFU supports a dummy LUN mode where the external
+ * links and storage are not required. Space on the FPGA is used
+ * to create 1 or 2 small LUNs which are presented to the system
+ * as if they were a normal storage device. This feature is useful
+ * during development and also provides manufacturing with a way
+ * to test the AFU without an actual device.
+ *
+ * 0 = external LUN[s] (default)
+ * 1 = internal LUN (1 x 64K, 512B blocks, id 0)
+ * 2 = internal LUN (1 x 64K, 4K blocks, id 0)
+ * 3 = internal LUN (2 x 32K, 512B blocks, ids 0,1)
+ * 4 = internal LUN (2 x 32K, 4K blocks, ids 0,1)
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t cxlflash_store_lun_mode(struct device *dev,
+				       struct device_attribute *attr,
+				       const char *buf, size_t count)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
+	struct afu *afu = cfg->afu;
+	int rc;
+	u32 lun_mode;
+
+	rc = kstrtouint(buf, 10, &lun_mode);
+	if (!rc && (lun_mode < 5) && (lun_mode != afu->internal_lun)) {
+		afu->internal_lun = lun_mode;
+		afu_reset(cfg);
+		scsi_scan_host(cfg->host);
+	}
+
+	return count;
+}
+
+/**
+ * cxlflash_show_ioctl_version() - presents the hosts current ioctl version
+ * @dev:	Generic device associated with the host.
+ * @attr:	Device attribute representing the ioctl version.
+ * @buf:	Buffer of length PAGE_SIZE to report back the ioctl version.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t cxlflash_show_ioctl_version(struct device *dev,
+					   struct device_attribute *attr,
+					   char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%u\n", DK_CXLFLASH_VERSION_0);
+}
+
+/**
+ * cxlflash_show_dev_mode() - presents the current mode of the device
+ * @dev:	Generic device associated with the device.
+ * @attr:	Device attribute representing the device mode.
+ * @buf:	Buffer of length PAGE_SIZE to report back the dev mode in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t cxlflash_show_dev_mode(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct scsi_device *sdev = to_scsi_device(dev);
+
+	return snprintf(buf, PAGE_SIZE, "%s\n",
+			sdev->hostdata ? "superpipe" : "legacy");
+}
+
+/*
+ * Host attributes
+ */
+static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL);
+static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL);
+static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode,
+		   cxlflash_store_lun_mode);
+static DEVICE_ATTR(ioctl_version, S_IRUGO, cxlflash_show_ioctl_version, NULL);
+
+static struct device_attribute *cxlflash_host_attrs[] = {
+	&dev_attr_port0,
+	&dev_attr_port1,
+	&dev_attr_lun_mode,
+	&dev_attr_ioctl_version,
+	NULL
+};
+
+/*
+ * Device attributes
+ */
+static DEVICE_ATTR(mode, S_IRUGO, cxlflash_show_dev_mode, NULL);
+
+static struct device_attribute *cxlflash_dev_attrs[] = {
+	&dev_attr_mode,
+	NULL
+};
+
+/*
+ * Host template
+ */
+static struct scsi_host_template driver_template = {
+	.module = THIS_MODULE,
+	.name = CXLFLASH_ADAPTER_NAME,
+	.info = cxlflash_driver_info,
+	.ioctl = cxlflash_ioctl,
+	.proc_name = CXLFLASH_NAME,
+	.queuecommand = cxlflash_queuecommand,
+	.eh_device_reset_handler = cxlflash_eh_device_reset_handler,
+	.eh_host_reset_handler = cxlflash_eh_host_reset_handler,
+	.change_queue_depth = cxlflash_change_queue_depth,
+	.cmd_per_lun = 16,
+	.can_queue = CXLFLASH_MAX_CMDS,
+	.this_id = -1,
+	.sg_tablesize = SG_NONE,	/* No scatter gather support. */
+	.max_sectors = CXLFLASH_MAX_SECTORS,
+	.use_clustering = ENABLE_CLUSTERING,
+	.shost_attrs = cxlflash_host_attrs,
+	.sdev_attrs = cxlflash_dev_attrs,
+};
+
+/*
+ * Device dependent values
+ */
+static struct dev_dependent_vals dev_corsa_vals = { CXLFLASH_MAX_SECTORS };
+
+/*
+ * PCI device binding table
+ */
+static struct pci_device_id cxlflash_pci_table[] = {
+	{PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CORSA,
+	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, (kernel_ulong_t)&dev_corsa_vals},
+	{}
+};
+
+MODULE_DEVICE_TABLE(pci, cxlflash_pci_table);
+
+/**
  * cxlflash_worker_thread() - work thread handler for the AFU
  * @work:	Work structure contained within cxlflash associated with host.
  *
-- 
2.1.0

* [PATCH v2 12/30] cxlflash: Refine host/device attributes
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (10 preceding siblings ...)
  2015-09-16 21:28 ` [PATCH v2 11/30] cxlflash: Make functions static Matthew R. Ochs
@ 2015-09-16 21:29 ` Matthew R. Ochs
  2015-09-18 21:34   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 13/30] cxlflash: Fix to avoid spamming the kernel log Matthew R. Ochs
                   ` (17 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:29 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan, Shane Seymour
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Implement the following suggestions and add two new attributes
to allow for debugging the port LUN table.

 - use scnprintf() instead of snprintf()
 - use DEVICE_ATTR_RO and DEVICE_ATTR_RW

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Shane Seymour <shane.seymour@hp.com>
---
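As an illustration of why the commit message prefers scnprintf() over
snprintf(), here is a minimal userspace sketch. It assumes the commonly
documented difference: snprintf() returns the length the output would
have had, while scnprintf() returns the number of characters actually
stored. The names sketch_scnprintf() and sketch_format_table() are
hypothetical stand-ins and are not part of the driver or the kernel.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in (hypothetical) for the kernel's scnprintf():
 * returns the number of characters actually stored in buf, excluding
 * the terminating NUL, and therefore never more than size - 1.
 */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;
	/* vsnprintf() reports the untruncated length; clamp it. */
	return ((size_t)n >= size) ? (int)(size - 1) : n;
}

/*
 * Format 'count' table entries into buf. Because the return value is
 * the stored length, 'bytes' can safely be used both as the write
 * offset and to shrink the remaining size on each iteration.
 */
static size_t sketch_format_table(char *buf, size_t size,
				  const unsigned long long *vals, int count)
{
	size_t bytes = 0;
	int i;

	for (i = 0; i < count; i++)
		bytes += sketch_scnprintf(buf + bytes, size - bytes,
					  "%03d: %016llX\n", i, vals[i]);
	return bytes;
}
```

Each formatted entry is 22 characters long, so a 64-byte buffer holds two
complete entries and a truncated third. The accumulated length stops at
63 rather than running past the end of the buffer, which is the property
a sysfs show routine relies on when it returns that length.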
 drivers/scsi/cxlflash/main.c | 180 +++++++++++++++++++++++++++++++++----------
 1 file changed, 138 insertions(+), 42 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index f2f41a7..919dfb1 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -2004,33 +2004,24 @@ static int cxlflash_change_queue_depth(struct scsi_device *sdev, int qdepth)
 
 /**
  * cxlflash_show_port_status() - queries and presents the current port status
- * @dev:	Generic device associated with the host owning the port.
- * @attr:	Device attribute representing the port.
+ * @port:	Desired port for status reporting.
+ * @afu:	AFU owning the specified port.
  * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
  *
  * Return: The size of the ASCII string returned in @buf.
  */
-static ssize_t cxlflash_show_port_status(struct device *dev,
-					 struct device_attribute *attr,
-					 char *buf)
+static ssize_t cxlflash_show_port_status(u32 port, struct afu *afu, char *buf)
 {
-	struct Scsi_Host *shost = class_to_shost(dev);
-	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
-	struct afu *afu = cfg->afu;
-
 	char *disp_status;
-	int rc;
-	u32 port;
 	u64 status;
-	u64 *fc_regs;
+	__be64 __iomem *fc_regs;
 
-	rc = kstrtouint((attr->attr.name + 4), 10, &port);
-	if (rc || (port >= NUM_FC_PORTS))
+	if (port >= NUM_FC_PORTS)
 		return 0;
 
 	fc_regs = &afu->afu_map->global.fc_regs[port][0];
-	status =
-	    (readq_be(&fc_regs[FC_MTIP_STATUS / 8]) & FC_MTIP_STATUS_MASK);
+	status = readq_be(&fc_regs[FC_MTIP_STATUS / 8]);
+	status &= FC_MTIP_STATUS_MASK;
 
 	if (status == FC_MTIP_STATUS_ONLINE)
 		disp_status = "online";
@@ -2039,31 +2030,69 @@ static ssize_t cxlflash_show_port_status(struct device *dev,
 	else
 		disp_status = "unknown";
 
-	return snprintf(buf, PAGE_SIZE, "%s\n", disp_status);
+	return scnprintf(buf, PAGE_SIZE, "%s\n", disp_status);
+}
+
+/**
+ * port0_show() - queries and presents the current status of port 0
+ * @dev:	Generic device associated with the host owning the port.
+ * @attr:	Device attribute representing the port.
+ * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t port0_show(struct device *dev,
+			  struct device_attribute *attr,
+			  char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
+	struct afu *afu = cfg->afu;
+
+	return cxlflash_show_port_status(0, afu, buf);
 }
 
 /**
- * cxlflash_show_lun_mode() - presents the current LUN mode of the host
+ * port1_show() - queries and presents the current status of port 1
+ * @dev:	Generic device associated with the host owning the port.
+ * @attr:	Device attribute representing the port.
+ * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t port1_show(struct device *dev,
+			  struct device_attribute *attr,
+			  char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
+	struct afu *afu = cfg->afu;
+
+	return cxlflash_show_port_status(1, afu, buf);
+}
+
+/**
+ * lun_mode_show() - presents the current LUN mode of the host
  * @dev:	Generic device associated with the host.
- * @attr:	Device attribute representing the lun mode.
+ * @attr:	Device attribute representing the LUN mode.
  * @buf:	Buffer of length PAGE_SIZE to report back the LUN mode in ASCII.
  *
  * Return: The size of the ASCII string returned in @buf.
  */
-static ssize_t cxlflash_show_lun_mode(struct device *dev,
-				      struct device_attribute *attr, char *buf)
+static ssize_t lun_mode_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
 {
 	struct Scsi_Host *shost = class_to_shost(dev);
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
 	struct afu *afu = cfg->afu;
 
-	return snprintf(buf, PAGE_SIZE, "%u\n", afu->internal_lun);
+	return scnprintf(buf, PAGE_SIZE, "%u\n", afu->internal_lun);
 }
 
 /**
- * cxlflash_store_lun_mode() - sets the LUN mode of the host
+ * lun_mode_store() - sets the LUN mode of the host
  * @dev:	Generic device associated with the host.
- * @attr:	Device attribute representing the lun mode.
+ * @attr:	Device attribute representing the LUN mode.
  * @buf:	Buffer of length PAGE_SIZE containing the LUN mode in ASCII.
  * @count:	Length of data resizing in @buf.
  *
@@ -2082,9 +2111,9 @@ static ssize_t cxlflash_show_lun_mode(struct device *dev,
  *
  * Return: The size of the ASCII string returned in @buf.
  */
-static ssize_t cxlflash_store_lun_mode(struct device *dev,
-				       struct device_attribute *attr,
-				       const char *buf, size_t count)
+static ssize_t lun_mode_store(struct device *dev,
+			      struct device_attribute *attr,
+			      const char *buf, size_t count)
 {
 	struct Scsi_Host *shost = class_to_shost(dev);
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
@@ -2103,58 +2132,125 @@ static ssize_t cxlflash_store_lun_mode(struct device *dev,
 }
 
 /**
- * cxlflash_show_ioctl_version() - presents the hosts current ioctl version
+ * ioctl_version_show() - presents the current ioctl version of the host
  * @dev:	Generic device associated with the host.
  * @attr:	Device attribute representing the ioctl version.
  * @buf:	Buffer of length PAGE_SIZE to report back the ioctl version.
  *
  * Return: The size of the ASCII string returned in @buf.
  */
-static ssize_t cxlflash_show_ioctl_version(struct device *dev,
-					   struct device_attribute *attr,
-					   char *buf)
+static ssize_t ioctl_version_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
 {
 	return scnprintf(buf, PAGE_SIZE, "%u\n", DK_CXLFLASH_VERSION_0);
 }
 
 /**
- * cxlflash_show_dev_mode() - presents the current mode of the device
+ * cxlflash_show_port_lun_table() - queries and presents the port LUN table
+ * @port:	Desired port for LUN table reporting.
+ * @afu:	AFU owning the specified port.
+ * @buf:	Buffer of length PAGE_SIZE to report back LUN table in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t cxlflash_show_port_lun_table(u32 port,
+					    struct afu *afu,
+					    char *buf)
+{
+	int i;
+	ssize_t bytes = 0;
+	__be64 __iomem *fc_port;
+
+	if (port >= NUM_FC_PORTS)
+		return 0;
+
+	fc_port = &afu->afu_map->global.fc_port[port][0];
+
+	for (i = 0; i < CXLFLASH_NUM_VLUNS; i++)
+		bytes += scnprintf(buf + bytes, PAGE_SIZE - bytes,
+				   "%03d: %016llX\n", i, readq_be(&fc_port[i]));
+	return bytes;
+}
+
+/**
+ * port0_lun_table_show() - presents the current LUN table of port 0
+ * @dev:	Generic device associated with the host owning the port.
+ * @attr:	Device attribute representing the port.
+ * @buf:	Buffer of length PAGE_SIZE to report back LUN table in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t port0_lun_table_show(struct device *dev,
+				    struct device_attribute *attr,
+				    char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
+	struct afu *afu = cfg->afu;
+
+	return cxlflash_show_port_lun_table(0, afu, buf);
+}
+
+/**
+ * port1_lun_table_show() - presents the current LUN table of port 1
+ * @dev:	Generic device associated with the host owning the port.
+ * @attr:	Device attribute representing the port.
+ * @buf:	Buffer of length PAGE_SIZE to report back LUN table in ASCII.
+ *
+ * Return: The size of the ASCII string returned in @buf.
+ */
+static ssize_t port1_lun_table_show(struct device *dev,
+				    struct device_attribute *attr,
+				    char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
+	struct afu *afu = cfg->afu;
+
+	return cxlflash_show_port_lun_table(1, afu, buf);
+}
+
+/**
+ * mode_show() - presents the current mode of the device
  * @dev:	Generic device associated with the device.
  * @attr:	Device attribute representing the device mode.
  * @buf:	Buffer of length PAGE_SIZE to report back the dev mode in ASCII.
  *
  * Return: The size of the ASCII string returned in @buf.
  */
-static ssize_t cxlflash_show_dev_mode(struct device *dev,
-				      struct device_attribute *attr, char *buf)
+static ssize_t mode_show(struct device *dev,
+			 struct device_attribute *attr, char *buf)
 {
 	struct scsi_device *sdev = to_scsi_device(dev);
 
-	return snprintf(buf, PAGE_SIZE, "%s\n",
-			sdev->hostdata ? "superpipe" : "legacy");
+	return scnprintf(buf, PAGE_SIZE, "%s\n",
+			 sdev->hostdata ? "superpipe" : "legacy");
 }
 
 /*
  * Host attributes
  */
-static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL);
-static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL);
-static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode,
-		   cxlflash_store_lun_mode);
-static DEVICE_ATTR(ioctl_version, S_IRUGO, cxlflash_show_ioctl_version, NULL);
+static DEVICE_ATTR_RO(port0);
+static DEVICE_ATTR_RO(port1);
+static DEVICE_ATTR_RW(lun_mode);
+static DEVICE_ATTR_RO(ioctl_version);
+static DEVICE_ATTR_RO(port0_lun_table);
+static DEVICE_ATTR_RO(port1_lun_table);
 
 static struct device_attribute *cxlflash_host_attrs[] = {
 	&dev_attr_port0,
 	&dev_attr_port1,
 	&dev_attr_lun_mode,
 	&dev_attr_ioctl_version,
+	&dev_attr_port0_lun_table,
+	&dev_attr_port1_lun_table,
 	NULL
 };
 
 /*
  * Device attributes
  */
-static DEVICE_ATTR(mode, S_IRUGO, cxlflash_show_dev_mode, NULL);
+static DEVICE_ATTR_RO(mode);
 
 static struct device_attribute *cxlflash_dev_attrs[] = {
 	&dev_attr_mode,
-- 
2.1.0

* [PATCH v2 13/30] cxlflash: Fix to avoid spamming the kernel log
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (11 preceding siblings ...)
  2015-09-16 21:29 ` [PATCH v2 12/30] cxlflash: Refine host/device attributes Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-18 21:39   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 14/30] cxlflash: Fix to avoid stall while waiting on TMF Matthew R. Ochs
                   ` (16 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

At run time the driver can be very chatty and spam the kernel log.
Rate-limit the noisiest print statements or move them to
development-only builds (pr_devel). Additionally, convert numerous
prints to their dev_* equivalents so that each message identifies
the corresponding device.

The following changes were made:
 - pr_debug to pr_devel
 - pr_debug to pr_debug_ratelimited
 - pr_err to dev_err
 - pr_debug to dev_dbg
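
For readers unfamiliar with rate-limited prints, the idea can be
sketched in plain userspace C. This is only a simplified model of an
interval/burst limiter, not the kernel's implementation: the struct
ratelimit type, its fields and ratelimit_ok() are hypothetical names
chosen for illustration, and time is passed in as a plain counter.

```c
#include <assert.h>

/* Hypothetical model: allow at most 'burst' messages per 'interval'. */
struct ratelimit {
	long interval;	/* length of one window, in arbitrary ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
};

/* Returns 1 if a message may be printed at time 'now', 0 if suppressed. */
int ratelimit_ok(struct ratelimit *rs, long now)
{
	/* Open a new window once the current one has expired. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return 1;
	}

	return 0;
}
```

A hot path such as queuecommand would consult the limiter before each
print, so a burst of identical messages costs a few log lines instead
of one per I/O.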

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 105 ++++++++++++++++++++++++-------------------
 1 file changed, 58 insertions(+), 47 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 919dfb1..600c7f9 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -58,8 +58,8 @@ static struct afu_cmd *cmd_checkout(struct afu *afu)
 		cmd = &afu->cmd[k];
 
 		if (!atomic_dec_if_positive(&cmd->free)) {
-			pr_debug("%s: returning found index=%d\n",
-				 __func__, cmd->slot);
+			pr_devel("%s: returning found index=%d cmd=%p\n",
+				 __func__, cmd->slot, cmd);
 			memset(cmd->buf, 0, CMD_BUFSIZE);
 			memset(cmd->rcb.cdb, 0, sizeof(cmd->rcb.cdb));
 			return cmd;
@@ -93,7 +93,7 @@ static void cmd_checkin(struct afu_cmd *cmd)
 		return;
 	}
 
-	pr_debug("%s: released cmd %p index=%d\n", __func__, cmd, cmd->slot);
+	pr_devel("%s: released cmd %p index=%d\n", __func__, cmd, cmd->slot);
 }
 
 /**
@@ -127,7 +127,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 	}
 
 	pr_debug("%s: cmd failed afu_rc=%d scsi_rc=%d fc_rc=%d "
-		 "afu_extra=0x%X, scsi_entra=0x%X, fc_extra=0x%X\n",
+		 "afu_extra=0x%X, scsi_extra=0x%X, fc_extra=0x%X\n",
 		 __func__, ioasa->rc.afu_rc, ioasa->rc.scsi_rc,
 		 ioasa->rc.fc_rc, ioasa->afu_extra, ioasa->scsi_extra,
 		 ioasa->fc_extra);
@@ -240,9 +240,9 @@ static void cmd_complete(struct afu_cmd *cmd)
 		cmd_is_tmf = cmd->cmd_tmf;
 		cmd_checkin(cmd); /* Don't use cmd after here */
 
-		pr_debug("%s: calling scsi_set_resid, scp=%p "
-			 "result=%X resid=%d\n", __func__,
-			 scp, scp->result, resid);
+		pr_debug_ratelimited("%s: calling scsi_done scp=%p result=%X "
+				     "ioasc=%d\n", __func__, scp, scp->result,
+				     cmd->sa.ioasc);
 
 		scsi_set_resid(scp, resid);
 		scsi_dma_unmap(scp);
@@ -417,12 +417,13 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	short lflag = 0;
 	struct Scsi_Host *host = scp->device->host;
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
+	struct device *dev = &cfg->dev->dev;
 	ulong lock_flags;
 	int rc = 0;
 
 	cmd = cmd_checkout(afu);
 	if (unlikely(!cmd)) {
-		pr_err("%s: could not get a free command\n", __func__);
+		dev_err(dev, "%s: could not get a free command\n", __func__);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
 	}
@@ -493,7 +494,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 {
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
 	struct afu *afu = cfg->afu;
-	struct pci_dev *pdev = cfg->dev;
+	struct device *dev = &cfg->dev->dev;
 	struct afu_cmd *cmd;
 	u32 port_sel = scp->device->channel + 1;
 	int nseg, i, ncount;
@@ -502,13 +503,14 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	short lflag = 0;
 	int rc = 0;
 
-	pr_debug("%s: (scp=%p) %d/%d/%d/%llu cdb=(%08X-%08X-%08X-%08X)\n",
-		 __func__, scp, host->host_no, scp->device->channel,
-		 scp->device->id, scp->device->lun,
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
-		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
+	dev_dbg_ratelimited(dev, "%s: (scp=%p) %d/%d/%d/%llu "
+			    "cdb=(%08X-%08X-%08X-%08X)\n",
+			    __func__, scp, host->host_no, scp->device->channel,
+			    scp->device->id, scp->device->lun,
+			    get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
+			    get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
+			    get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
+			    get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
 
 	/* If a Task Management Function is active, wait for it to complete
 	 * before continuing with regular commands.
@@ -523,13 +525,11 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 
 	switch (cfg->state) {
 	case STATE_RESET:
-		dev_dbg_ratelimited(&cfg->dev->dev, "%s: device is in reset!\n",
-				    __func__);
+		dev_dbg_ratelimited(dev, "%s: device is in reset!\n", __func__);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
 	case STATE_FAILTERM:
-		dev_dbg_ratelimited(&cfg->dev->dev, "%s: device has failed!\n",
-				    __func__);
+		dev_dbg_ratelimited(dev, "%s: device has failed!\n", __func__);
 		scp->result = (DID_NO_CONNECT << 16);
 		scp->scsi_done(scp);
 		rc = 0;
@@ -540,7 +540,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 
 	cmd = cmd_checkout(afu);
 	if (unlikely(!cmd)) {
-		pr_err("%s: could not get a free command\n", __func__);
+		dev_err(dev, "%s: could not get a free command\n", __func__);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
 	}
@@ -562,7 +562,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 
 	nseg = scsi_dma_map(scp);
 	if (unlikely(nseg < 0)) {
-		dev_err(&pdev->dev, "%s: Fail DMA map! nseg=%d\n",
+		dev_err(dev, "%s: Fail DMA map! nseg=%d\n",
 			__func__, nseg);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
@@ -585,6 +585,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	}
 
 out:
+	pr_devel("%s: returning rc=%d\n", __func__, rc);
 	return rc;
 }
 
@@ -657,9 +658,10 @@ static void term_mc(struct cxlflash_cfg *cfg, enum undo_level level)
 {
 	int rc = 0;
 	struct afu *afu = cfg->afu;
+	struct device *dev = &cfg->dev->dev;
 
 	if (!afu || !cfg->mcctx) {
-		pr_err("%s: returning from term_mc with NULL afu or MC\n",
+		dev_err(dev, "%s: returning from term_mc with NULL afu or MC\n",
 		       __func__);
 		return;
 	}
@@ -756,6 +758,7 @@ static int alloc_mem(struct cxlflash_cfg *cfg)
 	int rc = 0;
 	int i;
 	char *buf = NULL;
+	struct device *dev = &cfg->dev->dev;
 
 	/* This allocation is about 12K, i.e. only 1 64k page
 	 * and upto 4 4k pages
@@ -763,8 +766,8 @@ static int alloc_mem(struct cxlflash_cfg *cfg)
 	cfg->afu = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
 					    get_order(sizeof(struct afu)));
 	if (unlikely(!cfg->afu)) {
-		pr_err("%s: cannot get %d free pages\n",
-		       __func__, get_order(sizeof(struct afu)));
+		dev_err(dev, "%s: cannot get %d free pages\n",
+			__func__, get_order(sizeof(struct afu)));
 		rc = -ENOMEM;
 		goto out;
 	}
@@ -775,7 +778,8 @@ static int alloc_mem(struct cxlflash_cfg *cfg)
 		if (!((u64)buf & (PAGE_SIZE - 1))) {
 			buf = (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
 			if (unlikely(!buf)) {
-				pr_err("%s: Allocate command buffers fail!\n",
+				dev_err(dev,
+					"%s: Allocate command buffers fail!\n",
 				       __func__);
 				rc = -ENOMEM;
 				free_mem(cfg);
@@ -1289,6 +1293,7 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 {
 	struct afu *afu = (struct afu *)data;
 	struct cxlflash_cfg *cfg = afu->parent;
+	struct device *dev = &cfg->dev->dev;
 	u64 reg_unmasked;
 	const struct asyc_intr_info *info;
 	struct sisl_global_map *global = &afu->afu_map->global;
@@ -1303,8 +1308,8 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 	reg_unmasked = (reg & SISL_ASTATUS_UNMASK);
 
 	if (reg_unmasked == 0) {
-		pr_err("%s: spurious interrupt, aintr_status 0x%016llX\n",
-		       __func__, reg);
+		dev_err(dev, "%s: spurious interrupt, aintr_status 0x%016llX\n",
+			__func__, reg);
 		goto out;
 	}
 
@@ -1319,8 +1324,8 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 
 		port = info->port;
 
-		pr_err("%s: FC Port %d -> %s, fc_status 0x%08llX\n",
-		       __func__, port, info->desc,
+		dev_err(dev, "%s: FC Port %d -> %s, fc_status 0x%08llX\n",
+			__func__, port, info->desc,
 		       readq_be(&global->fc_regs[port][FC_STATUS / 8]));
 
 		/*
@@ -1328,8 +1333,8 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 		 * again if cleared before or w/o a reset
 		 */
 		if (info->action & LINK_RESET) {
-			pr_err("%s: FC Port %d: resetting link\n",
-			       __func__, port);
+			dev_err(dev, "%s: FC Port %d: resetting link\n",
+				__func__, port);
 			cfg->lr_state = LINK_RESET_REQUIRED;
 			cfg->lr_port = port;
 			schedule_work(&cfg->work_q);
@@ -1343,8 +1348,8 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 			 * should be the same and tracing one is sufficient.
 			 */
 
-			pr_err("%s: fc %d: clearing fc_error 0x%08llX\n",
-			       __func__, port, reg);
+			dev_err(dev, "%s: fc %d: clearing fc_error 0x%08llX\n",
+				__func__, port, reg);
 
 			writeq_be(reg, &global->fc_regs[port][FC_ERROR / 8]);
 			writeq_be(0, &global->fc_regs[port][FC_ERRCAP / 8]);
@@ -1352,7 +1357,7 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 	}
 
 out:
-	pr_debug("%s: returning rc=%d, afu=%p\n", __func__, IRQ_HANDLED, afu);
+	dev_dbg(dev, "%s: returning IRQ_HANDLED, afu=%p\n", __func__, afu);
 	return IRQ_HANDLED;
 }
 
@@ -1396,7 +1401,7 @@ static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[])
 	/* Get the VPD data from the device */
 	vpd_size = pci_read_vpd(dev, 0, sizeof(vpd_data), vpd_data);
 	if (unlikely(vpd_size <= 0)) {
-		pr_err("%s: Unable to read VPD (size = %ld)\n",
+		dev_err(&dev->dev, "%s: Unable to read VPD (size = %ld)\n",
 		       __func__, vpd_size);
 		rc = -ENODEV;
 		goto out;
@@ -1406,7 +1411,8 @@ static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[])
 	ro_start = pci_vpd_find_tag(vpd_data, 0, vpd_size,
 				    PCI_VPD_LRDT_RO_DATA);
 	if (unlikely(ro_start < 0)) {
-		pr_err("%s: VPD Read-only data not found\n", __func__);
+		dev_err(&dev->dev, "%s: VPD Read-only data not found\n",
+			__func__);
 		rc = -ENODEV;
 		goto out;
 	}
@@ -1435,8 +1441,8 @@ static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[])
 
 		i = pci_vpd_find_info_keyword(vpd_data, i, j, wwpn_vpd_tags[k]);
 		if (unlikely(i < 0)) {
-			pr_err("%s: Port %d WWPN not found in VPD\n",
-			       __func__, k);
+			dev_err(&dev->dev, "%s: Port %d WWPN not found "
+				"in VPD\n", __func__, k);
 			rc = -ENODEV;
 			goto out;
 		}
@@ -1444,7 +1450,8 @@ static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[])
 		j = pci_vpd_info_field_size(&vpd_data[i]);
 		i += PCI_VPD_INFO_FLD_HDR_SIZE;
 		if (unlikely((i + j > vpd_size) || (j != WWPN_LEN))) {
-			pr_err("%s: Port %d WWPN incomplete or VPD corrupt\n",
+			dev_err(&dev->dev, "%s: Port %d WWPN incomplete or "
+				"VPD corrupt\n",
 			       __func__, k);
 			rc = -ENODEV;
 			goto out;
@@ -1453,8 +1460,8 @@ static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[])
 		memcpy(tmp_buf, &vpd_data[i], WWPN_LEN);
 		rc = kstrtoul(tmp_buf, WWPN_LEN, (ulong *)&wwpn[k]);
 		if (unlikely(rc)) {
-			pr_err("%s: Fail to convert port %d WWPN to integer\n",
-			       __func__, k);
+			dev_err(&dev->dev, "%s: Fail to convert port %d WWPN "
+				"to integer\n", __func__, k);
 			rc = -ENODEV;
 			goto out;
 		}
@@ -1511,6 +1518,7 @@ static void init_pcr(struct cxlflash_cfg *cfg)
 static int init_global(struct cxlflash_cfg *cfg)
 {
 	struct afu *afu = cfg->afu;
+	struct device *dev = &cfg->dev->dev;
 	u64 wwpn[NUM_FC_PORTS];	/* wwpn of AFU ports */
 	int i = 0, num_ports = 0;
 	int rc = 0;
@@ -1518,7 +1526,7 @@ static int init_global(struct cxlflash_cfg *cfg)
 
 	rc = read_vpd(cfg, &wwpn[0]);
 	if (rc) {
-		pr_err("%s: could not read vpd rc=%d\n", __func__, rc);
+		dev_err(dev, "%s: could not read vpd rc=%d\n", __func__, rc);
 		goto out;
 	}
 
@@ -1561,7 +1569,7 @@ static int init_global(struct cxlflash_cfg *cfg)
 		    afu_set_wwpn(afu, i,
 				 &afu->afu_map->global.fc_regs[i][0],
 				 wwpn[i])) {
-			pr_err("%s: failed to set WWPN on port %d\n",
+			dev_err(dev, "%s: failed to set WWPN on port %d\n",
 			       __func__, i);
 			rc = -EIO;
 			goto out;
@@ -1804,6 +1812,7 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 		      res_hndl_t res_hndl_u, u8 mode)
 {
 	struct cxlflash_cfg *cfg = afu->parent;
+	struct device *dev = &cfg->dev->dev;
 	struct afu_cmd *cmd = NULL;
 	int rc = 0;
 	int retry_cnt = 0;
@@ -1822,7 +1831,7 @@ retry:
 		udelay(1000 * retry_cnt);
 		if (retry_cnt < MC_RETRY_CNT)
 			goto retry;
-		pr_err("%s: could not get a free command\n", __func__);
+		dev_err(dev, "%s: could not get a free command\n", __func__);
 		rc = -1;
 		goto out;
 	}
@@ -2310,6 +2319,7 @@ static void cxlflash_worker_thread(struct work_struct *work)
 	struct cxlflash_cfg *cfg = container_of(work, struct cxlflash_cfg,
 						work_q);
 	struct afu *afu = cfg->afu;
+	struct device *dev = &cfg->dev->dev;
 	int port;
 	ulong lock_flags;
 
@@ -2323,7 +2333,8 @@ static void cxlflash_worker_thread(struct work_struct *work)
 	if (cfg->lr_state == LINK_RESET_REQUIRED) {
 		port = cfg->lr_port;
 		if (port < 0)
-			pr_err("%s: invalid port index %d\n", __func__, port);
+			dev_err(dev, "%s: invalid port index %d\n",
+				__func__, port);
 		else {
 			spin_unlock_irqrestore(cfg->host->host_lock,
 					       lock_flags);
@@ -2428,7 +2439,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
 	 */
 	phys_dev = cxl_get_phys_dev(pdev);
 	if (!dev_is_pci(phys_dev)) {
-		pr_err("%s: not a pci dev\n", __func__);
+		dev_err(&pdev->dev, "%s: not a pci dev\n", __func__);
 		rc = -ENODEV;
 		goto out_remove;
 	}
-- 
2.1.0


* [PATCH v2 14/30] cxlflash: Fix to avoid stall while waiting on TMF
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (12 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 13/30] cxlflash: Fix to avoid spamming the kernel log Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-21 18:24   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 15/30] cxlflash: Fix location of setting resid Matthew R. Ochs
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Borrowing the TMF waitq's spinlock causes a stall condition when
waiting for the TMF to complete. To remedy, introduce our own spin
lock to serialize TMF and use the appropriate wait services.

Also add a 5 second timeout while waiting for a TMF completion. When
a TMF times out, report back a failure so that a bigger hammer reset
can occur.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  1 +
 drivers/scsi/cxlflash/main.c   | 55 +++++++++++++++++++++++++-----------------
 2 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 2855b09..c8327ac 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -126,6 +126,7 @@ struct cxlflash_cfg {
 	struct list_head lluns; /* list of llun_info structs */
 
 	wait_queue_head_t tmf_waitq;
+	spinlock_t tmf_slock;
 	bool tmf_active;
 	wait_queue_head_t reset_waitq;
 	enum cxlflash_state state;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 600c7f9..29e40cc 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -249,11 +249,10 @@ static void cmd_complete(struct afu_cmd *cmd)
 		scp->scsi_done(scp);
 
 		if (cmd_is_tmf) {
-			spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
+			spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 			cfg->tmf_active = false;
 			wake_up_all_locked(&cfg->tmf_waitq);
-			spin_unlock_irqrestore(&cfg->tmf_waitq.lock,
-					       lock_flags);
+			spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 		}
 	} else
 		complete(&cmd->cevent);
@@ -420,6 +419,7 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	struct device *dev = &cfg->dev->dev;
 	ulong lock_flags;
 	int rc = 0;
+	ulong to;
 
 	cmd = cmd_checkout(afu);
 	if (unlikely(!cmd)) {
@@ -428,15 +428,15 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 		goto out;
 	}
 
-	/* If a Task Management Function is active, do not send one more.
-	 */
-	spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
+	/* When Task Management Function is active do not send another */
+	spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 	if (cfg->tmf_active)
-		wait_event_interruptible_locked_irq(cfg->tmf_waitq,
-						    !cfg->tmf_active);
+		wait_event_interruptible_lock_irq(cfg->tmf_waitq,
+						  !cfg->tmf_active,
+						  cfg->tmf_slock);
 	cfg->tmf_active = true;
 	cmd->cmd_tmf = true;
-	spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
+	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
 	cmd->rcb.ctx_id = afu->ctx_hndl;
 	cmd->rcb.port_sel = port_sel;
@@ -457,15 +457,24 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	rc = send_cmd(afu, cmd);
 	if (unlikely(rc)) {
 		cmd_checkin(cmd);
-		spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
+		spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 		cfg->tmf_active = false;
-		spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
+		spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 		goto out;
 	}
 
-	spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
-	wait_event_interruptible_locked_irq(cfg->tmf_waitq, !cfg->tmf_active);
-	spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
+	spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
+	to = msecs_to_jiffies(5000);
+	to = wait_event_interruptible_lock_irq_timeout(cfg->tmf_waitq,
+						       !cfg->tmf_active,
+						       cfg->tmf_slock,
+						       to);
+	if (!to) {
+		cfg->tmf_active = false;
+		dev_err(dev, "%s: TMF timed out!\n", __func__);
+		rc = -1;
+	}
+	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 out:
 	return rc;
 }
@@ -512,16 +521,17 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 			    get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
 			    get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
 
-	/* If a Task Management Function is active, wait for it to complete
+	/*
+	 * If a Task Management Function is active, wait for it to complete
 	 * before continuing with regular commands.
 	 */
-	spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
+	spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 	if (cfg->tmf_active) {
-		spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
+		spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
 	}
-	spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
+	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
 	switch (cfg->state) {
 	case STATE_RESET:
@@ -713,11 +723,12 @@ static void cxlflash_remove(struct pci_dev *pdev)
 	/* If a Task Management Function is active, wait for it to complete
 	 * before continuing with remove.
 	 */
-	spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
+	spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 	if (cfg->tmf_active)
-		wait_event_interruptible_locked_irq(cfg->tmf_waitq,
-						    !cfg->tmf_active);
-	spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
+		wait_event_interruptible_lock_irq(cfg->tmf_waitq,
+						  !cfg->tmf_active,
+						  cfg->tmf_slock);
+	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
 	cfg->state = STATE_FAILTERM;
 	atomic_inc(&cfg->remove_active);
-- 
2.1.0


* [PATCH v2 15/30] cxlflash: Fix location of setting resid
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (13 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 14/30] cxlflash: Fix to avoid stall while waiting on TMF Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-21 18:28   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 16/30] cxlflash: Fix host link up event handling Matthew R. Ochs
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

The resid is incorrectly set, which can lead to unnecessary retry
attempts by the stack. This is due to resid _always_ being set
using a value returned from the adapter. Instead, the value
should only be interpreted and set in an underrun scenario.
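
The intended rule can be sketched in standalone C as follows. This is
a model under stated assumptions, not driver code: struct status_area
and RC_FLAGS_UNDERRUN are hypothetical stand-ins for the fields of
struct sisl_ioasa and for SISL_RC_FLAGS_UNDERRUN (the flag value here
is made up), and the residual count is assumed to start at zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value, stands in for SISL_RC_FLAGS_UNDERRUN. */
#define RC_FLAGS_UNDERRUN	0x02

/* Hypothetical model of the status fields the adapter hands back. */
struct status_area {
	uint8_t flags;
	uint32_t resid;
};

/*
 * Returns the residual count to report. The adapter's resid field is
 * trusted only when the underrun flag is set; otherwise the residual
 * stays at zero (assumed default).
 */
uint32_t residual_for(const struct status_area *sa)
{
	if (sa->flags & RC_FLAGS_UNDERRUN)
		return sa->resid;

	return 0;
}
```

With this rule a stale or meaningless resid value from a command that
did not underrun never reaches the stack.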

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 29e40cc..b3838af 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -107,6 +107,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 {
 	struct sisl_ioarcb *ioarcb;
 	struct sisl_ioasa *ioasa;
+	u32 resid;
 
 	if (unlikely(!cmd))
 		return;
@@ -115,9 +116,10 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 	ioasa = &(cmd->sa);
 
 	if (ioasa->rc.flags & SISL_RC_FLAGS_UNDERRUN) {
-		pr_debug("%s: cmd underrun cmd = %p scp = %p\n",
-			 __func__, cmd, scp);
-		scp->result = (DID_ERROR << 16);
+		resid = ioasa->resid;
+		scsi_set_resid(scp, resid);
+		pr_debug("%s: cmd underrun cmd = %p scp = %p, resid = %d\n",
+			 __func__, cmd, scp, resid);
 	}
 
 	if (ioasa->rc.flags & SISL_RC_FLAGS_OVERRUN) {
@@ -158,8 +160,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 				/* If the SISL_RC_FLAGS_OVERRUN flag was set,
 				 * then we will handle this error else where.
 				 * If not then we must handle it here.
-				 * This is probably an AFU bug. We will
-				 * attempt a retry to see if that resolves it.
+				 * This is probably an AFU bug.
 				 */
 				scp->result = (DID_ERROR << 16);
 			}
@@ -183,7 +184,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 		/* We have an AFU error */
 		switch (ioasa->rc.afu_rc) {
 		case SISL_AFU_RC_NO_CHANNELS:
-			scp->result = (DID_MEDIUM_ERROR << 16);
+			scp->result = (DID_NO_CONNECT << 16);
 			break;
 		case SISL_AFU_RC_DATA_DMA_ERR:
 			switch (ioasa->afu_extra) {
@@ -217,7 +218,6 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 static void cmd_complete(struct afu_cmd *cmd)
 {
 	struct scsi_cmnd *scp;
-	u32 resid;
 	ulong lock_flags;
 	struct afu *afu = cmd->parent;
 	struct cxlflash_cfg *cfg = afu->parent;
@@ -229,14 +229,11 @@ static void cmd_complete(struct afu_cmd *cmd)
 
 	if (cmd->rcb.scp) {
 		scp = cmd->rcb.scp;
-		if (unlikely(cmd->sa.rc.afu_rc ||
-			     cmd->sa.rc.scsi_rc ||
-			     cmd->sa.rc.fc_rc))
+		if (unlikely(cmd->sa.ioasc))
 			process_cmd_err(cmd, scp);
 		else
 			scp->result = (DID_OK << 16);
 
-		resid = cmd->sa.resid;
 		cmd_is_tmf = cmd->cmd_tmf;
 		cmd_checkin(cmd); /* Don't use cmd after here */
 
@@ -244,7 +241,6 @@ static void cmd_complete(struct afu_cmd *cmd)
 				     "ioasc=%d\n", __func__, scp, scp->result,
 				     cmd->sa.ioasc);
 
-		scsi_set_resid(scp, resid);
 		scsi_dma_unmap(scp);
 		scp->scsi_done(scp);
 
-- 
2.1.0


* [PATCH v2 16/30] cxlflash: Fix host link up event handling
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (14 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 15/30] cxlflash: Fix location of setting resid Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-21 21:47   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 17/30] cxlflash: Fix async interrupt bypass logic Matthew R. Ochs
                   ` (13 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Following a link up event, the LUNs available to the host may
have changed. Without rescanning the host, the LUN topology is
unknown to the user. In such a state, the user would be unable
to locate provisioned resources.

To remedy, rescan the host after a link up or login succeeded event.
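
The hand-off from interrupt handler to worker can be modelled in
userspace with C11 atomics. The sketch below is an illustration only:
dec_if_positive() imitates the kernel's atomic_dec_if_positive(),
which returns the old value minus one whether or not it decremented,
while request_scan() and worker_run() are hypothetical names standing
in for the interrupt and worker paths.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int scan_host_needed;

/*
 * Userspace imitation of atomic_dec_if_positive(): decrement only if
 * the counter is positive, and return (old value - 1) either way.
 */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	while (old > 0 &&
	       !atomic_compare_exchange_weak(v, &old, old - 1))
		;	/* 'old' was reloaded by the failed exchange */

	return old - 1;
}

/* Interrupt side: note that a rescan is wanted. */
void request_scan(void)
{
	atomic_fetch_add(&scan_host_needed, 1);
}

/* Worker side: returns 1 if a scan would be performed, else 0. */
int worker_run(void)
{
	if (dec_if_positive(&scan_host_needed) >= 0)
		return 1;	/* the driver would rescan the host here */

	return 0;
}
```

Each worker pass consumes at most one pending request, and a pass with
nothing pending leaves the counter at zero rather than driving it
negative.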

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  2 +-
 drivers/scsi/cxlflash/main.c   | 17 +++++++++++++----
 drivers/scsi/cxlflash/main.h   |  1 +
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index c8327ac..517da25 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -102,7 +102,7 @@ struct cxlflash_cfg {
 	enum cxlflash_init_state init_state;
 	enum cxlflash_lr_state lr_state;
 	int lr_port;
-
+	atomic_t scan_host_needed;
 	atomic_t remove_active;
 
 	struct cxl_afu *cxl_afu;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index b3838af..39ad7a3 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1120,17 +1120,17 @@ static const struct asyc_intr_info ainfo[] = {
 	{SISL_ASTATUS_FC0_CRC_T, "CRC threshold exceeded", 0, LINK_RESET},
 	{SISL_ASTATUS_FC0_LOGI_R, "login timed out, retrying", 0, 0},
 	{SISL_ASTATUS_FC0_LOGI_F, "login failed", 0, CLR_FC_ERROR},
-	{SISL_ASTATUS_FC0_LOGI_S, "login succeeded", 0, 0},
+	{SISL_ASTATUS_FC0_LOGI_S, "login succeeded", 0, SCAN_HOST},
 	{SISL_ASTATUS_FC0_LINK_DN, "link down", 0, 0},
-	{SISL_ASTATUS_FC0_LINK_UP, "link up", 0, 0},
+	{SISL_ASTATUS_FC0_LINK_UP, "link up", 0, SCAN_HOST},
 	{SISL_ASTATUS_FC1_OTHER, "other error", 1, CLR_FC_ERROR | LINK_RESET},
 	{SISL_ASTATUS_FC1_LOGO, "target initiated LOGO", 1, 0},
 	{SISL_ASTATUS_FC1_CRC_T, "CRC threshold exceeded", 1, LINK_RESET},
 	{SISL_ASTATUS_FC1_LOGI_R, "login timed out, retrying", 1, 0},
 	{SISL_ASTATUS_FC1_LOGI_F, "login failed", 1, CLR_FC_ERROR},
-	{SISL_ASTATUS_FC1_LOGI_S, "login succeeded", 1, 0},
+	{SISL_ASTATUS_FC1_LOGI_S, "login succeeded", 1, SCAN_HOST},
 	{SISL_ASTATUS_FC1_LINK_DN, "link down", 1, 0},
-	{SISL_ASTATUS_FC1_LINK_UP, "link up", 1, 0},
+	{SISL_ASTATUS_FC1_LINK_UP, "link up", 1, SCAN_HOST},
 	{0x0, "", 0, 0}		/* terminator */
 };
 
@@ -1361,6 +1361,11 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 			writeq_be(reg, &global->fc_regs[port][FC_ERROR / 8]);
 			writeq_be(0, &global->fc_regs[port][FC_ERRCAP / 8]);
 		}
+
+		if (info->action & SCAN_HOST) {
+			atomic_inc(&cfg->scan_host_needed);
+			schedule_work(&cfg->work_q);
+		}
 	}
 
 out:
@@ -2320,6 +2325,7 @@ MODULE_DEVICE_TABLE(pci, cxlflash_pci_table);
  * - Link reset which cannot be performed on interrupt context due to
  * blocking up to a few seconds
  * - Read AFU command room
+ * - Rescan the host
  */
 static void cxlflash_worker_thread(struct work_struct *work)
 {
@@ -2362,6 +2368,9 @@ static void cxlflash_worker_thread(struct work_struct *work)
 	}
 
 	spin_unlock_irqrestore(cfg->host->host_lock, lock_flags);
+
+	if (atomic_dec_if_positive(&cfg->scan_host_needed) >= 0)
+		scsi_scan_host(cfg->host);
 }
 
 /**
diff --git a/drivers/scsi/cxlflash/main.h b/drivers/scsi/cxlflash/main.h
index cf0e809..6032456 100644
--- a/drivers/scsi/cxlflash/main.h
+++ b/drivers/scsi/cxlflash/main.h
@@ -99,6 +99,7 @@ struct asyc_intr_info {
 	u8 action;
 #define CLR_FC_ERROR	0x01
 #define LINK_RESET	0x02
+#define SCAN_HOST	0x04
 };
 
 #ifndef CONFIG_CXL_EEH
-- 
2.1.0


* [PATCH v2 17/30] cxlflash: Fix async interrupt bypass logic
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (15 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 16/30] cxlflash: Fix host link up event handling Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-21 21:48   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 18/30] cxlflash: Remove dual port online dependency Matthew R. Ochs
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

A bug was introduced earlier in the development cycle when cleaning
up logic statements. Instead of skipping bits that are not set, set
bits are skipped, causing async interrupts to not be handled correctly.

To fix, simply add back in the proper evaluation for an unset bit.
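
The corrected loop can be exercised outside the driver with a small
standalone model. The table and helper names below (struct bit_info,
table, find_info, handle_bits) are hypothetical stand-ins for ainfo[]
and find_ainfo(); only the skip condition mirrors the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the async interrupt info table. */
struct bit_info {
	uint64_t status;
	const char *desc;
};

static const struct bit_info table[] = {
	{ 1ULL << 0, "link up" },
	{ 1ULL << 2, "login failed" },
	{ 0x0, "" }	/* terminator */
};

static const struct bit_info *find_info(uint64_t status)
{
	const struct bit_info *info;

	for (info = table; info->status; info++)
		if (info->status == status)
			return info;

	return NULL;
}

/* Returns a mask of the status bits that were actually handled. */
uint64_t handle_bits(uint64_t reg)
{
	uint64_t handled = 0;
	int i;

	for (i = 0; reg; i++, reg >>= 1) {
		const struct bit_info *info = find_info(1ULL << i);

		/* Skip bits that are clear or have no table entry. */
		if (((reg & 0x1) == 0) || !info)
			continue;

		handled |= info->status;
	}

	return handled;
}
```

With the inverted test, `(reg & 0x1) || !info`, the same loop would
skip every set bit and handle nothing.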

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 39ad7a3..74eb742 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1326,7 +1326,7 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 	/* check each bit that is on */
 	for (i = 0; reg_unmasked; i++, reg_unmasked = (reg_unmasked >> 1)) {
 		info = find_ainfo(1ULL << i);
-		if ((reg_unmasked & 0x1) || !info)
+		if (((reg_unmasked & 0x1) == 0) || !info)
 			continue;
 
 		port = info->port;
-- 
2.1.0


* [PATCH v2 18/30] cxlflash: Remove dual port online dependency
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (16 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 17/30] cxlflash: Fix async interrupt bypass logic Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-21 22:02   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 19/30] cxlflash: Fix AFU version access/storage and add check Matthew R. Ochs
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

At present, both ports must be online for the device to
configure properly. Remove this dependency and the unnecessary
internal LUN override logic as well. Additionally, as a refactoring
measure, change the return code variable name to match that used
throughout the driver.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 74eb742..e2cc410 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1031,7 +1031,7 @@ static int wait_port_offline(u64 *fc_regs, u32 delay_us, u32 nretry)
  */
 static int afu_set_wwpn(struct afu *afu, int port, u64 *fc_regs, u64 wwpn)
 {
-	int ret = 0;
+	int rc = 0;
 
 	set_port_offline(fc_regs);
 
@@ -1039,33 +1039,26 @@ static int afu_set_wwpn(struct afu *afu, int port, u64 *fc_regs, u64 wwpn)
 			       FC_PORT_STATUS_RETRY_CNT)) {
 		pr_debug("%s: wait on port %d to go offline timed out\n",
 			 __func__, port);
-		ret = -1; /* but continue on to leave the port back online */
+		rc = -1; /* but continue on to leave the port back online */
 	}
 
-	if (ret == 0)
+	if (rc == 0)
 		writeq_be(wwpn, &fc_regs[FC_PNAME / 8]);
 
+	/* Always return success after programming WWPN */
+	rc = 0;
+
 	set_port_online(fc_regs);
 
 	if (!wait_port_online(fc_regs, FC_PORT_STATUS_RETRY_INTERVAL_US,
 			      FC_PORT_STATUS_RETRY_CNT)) {
 		pr_debug("%s: wait on port %d to go online timed out\n",
 			 __func__, port);
-		ret = -1;
-
-		/*
-		 * Override for internal lun!!!
-		 */
-		if (afu->internal_lun) {
-			pr_debug("%s: Overriding port %d online timeout!!!\n",
-				 __func__, port);
-			ret = 0;
-		}
 	}
 
-	pr_debug("%s: returning rc=%d\n", __func__, ret);
+	pr_debug("%s: returning rc=%d\n", __func__, rc);
 
-	return ret;
+	return rc;
 }
 
 /**
-- 
2.1.0

* [PATCH v2 19/30] cxlflash: Fix AFU version access/storage and add check
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (17 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 18/30] cxlflash: Remove dual port online dependency Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-22 20:47   ` Brian King
  2015-09-16 21:30 ` [PATCH v2 20/30] cxlflash: Correct usage of scsi_host_put() Matthew R. Ochs
                   ` (10 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

The AFU version is stored as a non-terminated string of bytes within
a 64-bit little-endian register. Presently the value is read directly
(no MMIO accessor) and is stored in a buffer that is not big enough
to contain a null terminator. Additionally, the version obtained is not
evaluated against a known value to prevent usage with unsupported AFUs.
All of these deficiencies can lead to a variety of problems.

To remedy, use the correct MMIO accessor to read the version value into
a null-terminated buffer and add a check to prevent an incompatible AFU
from being used with this driver.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
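A minimal userspace sketch of the buffer handling described above, not
the driver code itself. store_afu_version() and afu_is_back_level() are
hypothetical helper names. The sketch assumes the eight version bytes
have already been read from the register, and that an all-ones interface
version marks a back-level AFU, as the "(interface_version + 1) == 0"
test in the diff suggests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AFU_VERSION_BYTES 8	/* width of the afu_version register */

/*
 * Hypothetical helper: copy the raw, non-terminated version bytes into
 * a caller-supplied buffer and terminate the result. Returns false when
 * the buffer cannot hold the bytes plus the terminator.
 */
static bool store_afu_version(char *dst, size_t dst_len,
			      const unsigned char raw[AFU_VERSION_BYTES])
{
	if (dst_len < AFU_VERSION_BYTES + 1)
		return false;

	/* Zero first so the copied bytes are always followed by '\0' */
	memset(dst, 0, dst_len);
	memcpy(dst, raw, AFU_VERSION_BYTES);
	return true;
}

/* An interface version of all ones wraps to zero when incremented */
static bool afu_is_back_level(uint64_t interface_version)
{
	return (interface_version + 1) == 0;
}
```

The sketch zeroes the destination explicitly rather than depending on
how the enclosing structure was allocated, so the terminator does not
rely on the caller.
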
 drivers/scsi/cxlflash/common.h  |  2 +-
 drivers/scsi/cxlflash/main.c    | 18 ++++++++++++------
 drivers/scsi/cxlflash/sislite.h |  2 +-
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 517da25..f392319 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -180,7 +180,7 @@ struct afu {
 	u32 cmd_couts;		/* Number of command checkouts */
 	u32 internal_lun;	/* User-desired LUN mode for this AFU */
 
-	char version[8];
+	char version[16];
 	u64 interface_version;
 
 	struct cxlflash_cfg *parent; /* Pointer back to parent cxlflash_cfg */
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index e2cc410..fc77cd4 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1762,14 +1762,20 @@ static int init_afu(struct cxlflash_cfg *cfg)
 		goto err1;
 	}
 
-	/* don't byte reverse on reading afu_version, else the string form */
-	/*     will be backwards */
-	reg = afu->afu_map->global.regs.afu_version;
-	memcpy(afu->version, &reg, 8);
+	/* No byte reverse on reading afu_version or string will be backwards */
+	reg = readq(&afu->afu_map->global.regs.afu_version);
+	memcpy(afu->version, &reg, sizeof(reg));
 	afu->interface_version =
 	    readq_be(&afu->afu_map->global.regs.interface_version);
-	pr_debug("%s: afu version %s, interface version 0x%llX\n",
-		 __func__, afu->version, afu->interface_version);
+	if ((afu->interface_version + 1) == 0) {
+		pr_err("Back level AFU, please upgrade. AFU version %s "
+		       "interface version 0x%llx\n", afu->version,
+		       afu->interface_version);
+		rc = -EINVAL;
+		goto err1;
+	} else
+		pr_debug("%s: afu version %s, interface version 0x%llX\n",
+			 __func__, afu->version, afu->interface_version);
 
 	rc = start_afu(cfg);
 	if (rc) {
diff --git a/drivers/scsi/cxlflash/sislite.h b/drivers/scsi/cxlflash/sislite.h
index 63bf394..8425d1a 100644
--- a/drivers/scsi/cxlflash/sislite.h
+++ b/drivers/scsi/cxlflash/sislite.h
@@ -340,7 +340,7 @@ struct sisl_global_regs {
 #define SISL_AFUCONF_MBOX_CLR_READ     0x0010ULL
 	__be64 afu_config;
 	__be64 rsvd[0xf8];
-	__be64 afu_version;
+	__le64 afu_version;
 	__be64 interface_version;
 };
 
-- 
2.1.0

* [PATCH v2 20/30] cxlflash: Correct usage of scsi_host_put()
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (18 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 19/30] cxlflash: Fix AFU version access/storage and add check Matthew R. Ochs
@ 2015-09-16 21:30 ` Matthew R. Ochs
  2015-09-22 20:53   ` Brian King
  2015-09-16 21:31 ` [PATCH v2 21/30] cxlflash: Fix to prevent workq from accessing freed memory Matthew R. Ochs
                   ` (9 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:30 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Currently, scsi_host_put() is being called prematurely in the
remove path and is missing entirely in an error cleanup path.
The former can lead to memory being freed too early with
subsequent access potentially corrupting data whilst the latter
would result in a memory leak.

Move the usage on remove to be the last cleanup action taken
and introduce a call to scsi_host_put() in the one initialization
error path that does not use remove to clean up.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
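The ordering argument above can be illustrated with a small userspace
model. This is only a sketch: struct toy_host, toy_host_put() and
toy_remove() are hypothetical stand-ins, and the step names merely echo
the calls visible in the diff. The point it shows is that the final
reference is dropped only after every other teardown step has run.

```c
#include <stddef.h>

enum toy_init_state { TOY_INIT_NONE, TOY_INIT_PCI, TOY_INIT_AFU, TOY_INIT_SCSI };

enum toy_step {
	STEP_REMOVE_HOST,
	STEP_TERM_AFU,
	STEP_RELEASE_PCI,
	STEP_FREE_MEM,
	STEP_HOST_PUT
};

#define MAX_STEPS 8

/* Hypothetical stand-in for the host: a reference count plus a step log */
struct toy_host {
	int refcount;
	int freed;
	enum toy_step steps[MAX_STEPS];
	size_t nsteps;
};

static void log_step(struct toy_host *h, enum toy_step step)
{
	if (h->nsteps < MAX_STEPS)
		h->steps[h->nsteps++] = step;
}

/* Drop one reference; the host counts as released when it reaches zero */
static void toy_host_put(struct toy_host *h)
{
	log_step(h, STEP_HOST_PUT);
	if (--h->refcount == 0)
		h->freed = 1;
}

/* Waterfall teardown: enter at the level reached, fall through to the end */
static void toy_remove(struct toy_host *h, enum toy_init_state state)
{
	switch (state) {
	case TOY_INIT_SCSI:
		log_step(h, STEP_REMOVE_HOST);
		/* fall through */
	case TOY_INIT_AFU:
		log_step(h, STEP_TERM_AFU);
		/* fall through */
	case TOY_INIT_PCI:
		log_step(h, STEP_RELEASE_PCI);
		/* fall through */
	case TOY_INIT_NONE:
		log_step(h, STEP_FREE_MEM);
		toy_host_put(h);	/* last action: no later step uses the host */
		break;
	}
}
```

Placing the put in the final case also means it runs for every entry
point into the waterfall, not only for the fully initialized state.
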
 drivers/scsi/cxlflash/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index fc77cd4..1856a73 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -734,7 +734,6 @@ static void cxlflash_remove(struct pci_dev *pdev)
 	case INIT_STATE_SCSI:
 		cxlflash_term_local_luns(cfg);
 		scsi_remove_host(cfg->host);
-		scsi_host_put(cfg->host);
 		/* Fall through */
 	case INIT_STATE_AFU:
 		term_afu(cfg);
@@ -744,6 +743,7 @@ static void cxlflash_remove(struct pci_dev *pdev)
 	case INIT_STATE_NONE:
 		flush_work(&cfg->work_q);
 		free_mem(cfg);
+		scsi_host_put(cfg->host);
 		break;
 	}
 
@@ -2415,6 +2415,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
 		dev_err(&pdev->dev, "%s: call to scsi_host_alloc failed!\n",
 			__func__);
 		rc = -ENOMEM;
+		scsi_host_put(cfg->host);
 		goto out;
 	}
 
-- 
2.1.0

* [PATCH v2 21/30] cxlflash: Fix to prevent workq from accessing freed memory
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (19 preceding siblings ...)
  2015-09-16 21:30 ` [PATCH v2 20/30] cxlflash: Correct usage of scsi_host_put() Matthew R. Ochs
@ 2015-09-16 21:31 ` Matthew R. Ochs
  2015-09-21 12:25   ` Tomas Henzl
  2015-09-16 21:31 ` [PATCH v2 22/30] cxlflash: Correct behavior in device reset handler following EEH Matthew R. Ochs
                   ` (8 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:31 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

The workq can process work in parallel with a remove event, leading
to a condition where the workq handler can access freed memory.

To remedy, the workq should be terminated prior to freeing memory. Move
the termination call earlier in remove and use cancel_work_sync() instead
of flush_work() as there is no need to process any scheduled work when
shutting down.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 1856a73..1625aea 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -736,12 +736,12 @@ static void cxlflash_remove(struct pci_dev *pdev)
 		scsi_remove_host(cfg->host);
 		/* Fall through */
 	case INIT_STATE_AFU:
+		cancel_work_sync(&cfg->work_q);
 		term_afu(cfg);
 	case INIT_STATE_PCI:
 		pci_release_regions(cfg->dev);
 		pci_disable_device(pdev);
 	case INIT_STATE_NONE:
-		flush_work(&cfg->work_q);
 		free_mem(cfg);
 		scsi_host_put(cfg->host);
 		break;
-- 
2.1.0

* [PATCH v2 22/30] cxlflash: Correct behavior in device reset handler following EEH
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (20 preceding siblings ...)
  2015-09-16 21:31 ` [PATCH v2 21/30] cxlflash: Fix to prevent workq from accessing freed memory Matthew R. Ochs
@ 2015-09-16 21:31 ` Matthew R. Ochs
  2015-09-22 20:58   ` Brian King
  2015-09-16 21:31 ` [PATCH v2 23/30] cxlflash: Remove unnecessary scsi_block_requests Matthew R. Ochs
                   ` (7 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:31 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

When the device reset handler is entered while a reset operation
is taking place, the handler exits without actually sending a
reset (TMF) to the targeted device. This behavior is incorrect
as the device is not reset. Further complicating matters is the
fact that a success is returned even when the TMF was not sent.

To fix, the state is rechecked after coming out of the reset
state. When the state is normal, a TMF will be sent out.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
---
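The recheck described above can be modelled in userspace as follows.
This is a sketch under stated assumptions: the toy_* names are
hypothetical, and toy_wait_for_reset() merely simulates wait_event() by
moving the state to whatever the in-flight reset is assumed to leave
behind.

```c
enum toy_state { TOY_NORMAL, TOY_RESET, TOY_FAILTERM };
enum toy_result { TOY_SUCCESS, TOY_FAILED };

struct toy_cfg {
	enum toy_state state;
	/* State the simulated reset settles into; must not be TOY_RESET */
	enum toy_state state_after_wait;
	int tmf_sent;
};

/* Stand-in for wait_event(): returns once the state has left TOY_RESET */
static void toy_wait_for_reset(struct toy_cfg *cfg)
{
	cfg->state = cfg->state_after_wait;
}

/* Stand-in for sending the TMF; counts calls and always succeeds */
static int toy_send_tmf(struct toy_cfg *cfg)
{
	cfg->tmf_sent++;
	return 0;
}

static enum toy_result toy_device_reset(struct toy_cfg *cfg)
{
	enum toy_result rc = TOY_SUCCESS;

retry:
	switch (cfg->state) {
	case TOY_NORMAL:
		if (toy_send_tmf(cfg))
			rc = TOY_FAILED;
		break;
	case TOY_RESET:
		toy_wait_for_reset(cfg);
		goto retry;	/* re-evaluate the state the reset left behind */
	default:
		rc = TOY_FAILED;
		break;
	}

	return rc;
}
```

With the retry, a handler that entered during a reset either sends the
TMF once the state is normal again or reports failure; it no longer
returns success without having sent anything.
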
 drivers/scsi/cxlflash/main.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 1625aea..922fac7 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1931,6 +1931,7 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
 		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
 		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
 
+retry:
 	switch (cfg->state) {
 	case STATE_NORMAL:
 		rcr = send_tmf(afu, scp, TMF_LUN_RESET);
@@ -1939,9 +1940,7 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
 		break;
 	case STATE_RESET:
 		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
-		if (cfg->state == STATE_NORMAL)
-			break;
-		/* fall through */
+		goto retry;
 	default:
 		rc = FAILED;
 		break;
-- 
2.1.0

* [PATCH v2 23/30] cxlflash: Remove unnecessary scsi_block_requests
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (21 preceding siblings ...)
  2015-09-16 21:31 ` [PATCH v2 22/30] cxlflash: Correct behavior in device reset handler following EEH Matthew R. Ochs
@ 2015-09-16 21:31 ` Matthew R. Ochs
  2015-09-22 20:59   ` Brian King
  2015-09-16 21:31 ` [PATCH v2 24/30] cxlflash: Fix function prolog parameters and return codes Matthew R. Ochs
                   ` (6 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:31 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

The host reset handler is called with I/O already blocked, thus
there is no need to explicitly block and unblock I/O in the handler.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 922fac7..24ff8dc 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1977,7 +1977,6 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 	switch (cfg->state) {
 	case STATE_NORMAL:
 		cfg->state = STATE_RESET;
-		scsi_block_requests(cfg->host);
 		cxlflash_mark_contexts_error(cfg);
 		rcr = afu_reset(cfg);
 		if (rcr) {
@@ -1986,7 +1985,6 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 		} else
 			cfg->state = STATE_NORMAL;
 		wake_up_all(&cfg->reset_waitq);
-		scsi_unblock_requests(cfg->host);
 		break;
 	case STATE_RESET:
 		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
-- 
2.1.0

* [PATCH v2 24/30] cxlflash: Fix function prolog parameters and return codes
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (22 preceding siblings ...)
  2015-09-16 21:31 ` [PATCH v2 23/30] cxlflash: Remove unnecessary scsi_block_requests Matthew R. Ochs
@ 2015-09-16 21:31 ` Matthew R. Ochs
  2015-09-22 21:02   ` Brian King
  2015-09-16 21:32 ` [PATCH v2 25/30] cxlflash: Fix MMIO and endianness errors Matthew R. Ochs
                   ` (5 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:31 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Several function prologs have incorrect parameter names and return
code descriptions. This can lead to confusion when reviewing the
source and creates inaccurate documentation.

To remedy, update the function prologs to properly reflect parameter
names and return codes.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 68 ++++++++++++++++----------------------------
 1 file changed, 25 insertions(+), 43 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 24ff8dc..e22fc7e 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -401,8 +401,7 @@ static void wait_resp(struct afu *afu, struct afu_cmd *cmd)
  * @tmfcmd:	TMF command to send.
  *
  * Return:
- *	0 on success
- *	SCSI_MLQUEUE_HOST_BUSY when host is busy
+ *	0 on success or SCSI_MLQUEUE_HOST_BUSY
  */
 static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 {
@@ -491,9 +490,7 @@ static const char *cxlflash_driver_info(struct Scsi_Host *host)
  * @host:	SCSI host associated with device.
  * @scp:	SCSI command to send.
  *
- * Return:
- *	0 on success
- *	SCSI_MLQUEUE_HOST_BUSY when host is busy
+ * Return: 0 on success or SCSI_MLQUEUE_HOST_BUSY
  */
 static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 {
@@ -597,7 +594,7 @@ out:
 
 /**
  * cxlflash_wait_for_pci_err_recovery() - wait for error recovery during probe
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  */
 static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
 {
@@ -611,7 +608,7 @@ static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
 
 /**
  * free_mem() - free memory associated with the AFU
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  */
 static void free_mem(struct cxlflash_cfg *cfg)
 {
@@ -633,7 +630,7 @@ static void free_mem(struct cxlflash_cfg *cfg)
 
 /**
  * stop_afu() - stops the AFU command timers and unmaps the MMIO space
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
  * Safe to call with AFU in a partially allocated/initialized state.
  */
@@ -655,7 +652,7 @@ static void stop_afu(struct cxlflash_cfg *cfg)
 
 /**
  * term_mc() - terminates the master context
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  * @level:	Depth of allocation, where to begin waterfall tear down.
  *
  * Safe to call with AFU/MC in partially allocated/initialized state.
@@ -691,7 +688,7 @@ static void term_mc(struct cxlflash_cfg *cfg, enum undo_level level)
 
 /**
  * term_afu() - terminates the AFU
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
  * Safe to call with AFU/MC in partially allocated/initialized state.
  */
@@ -752,7 +749,7 @@ static void cxlflash_remove(struct pci_dev *pdev)
 
 /**
  * alloc_mem() - allocates the AFU and its command pool
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
  * A partially allocated state remains on failure.
  *
@@ -805,12 +802,9 @@ out:
 
 /**
  * init_pci() - initializes the host as a PCI device
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
- * Return:
- *	0 on success
- *	-EIO on unable to communicate with device
- *	A return code from the PCI sub-routines
+ * Return: 0 on success, -errno on failure
  */
 static int init_pci(struct cxlflash_cfg *cfg)
 {
@@ -890,11 +884,9 @@ out_release_regions:
 
 /**
  * init_scsi() - adds the host to the SCSI stack and kicks off host scan
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
- * Return:
- *	0 on success
- *	A return code from adding the host
+ * Return: 0 on success, -errno on failure
  */
 static int init_scsi(struct cxlflash_cfg *cfg)
 {
@@ -1368,7 +1360,7 @@ out:
 
 /**
  * start_context() - starts the master context
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
  * Return: A success or failure value from CXL services.
  */
@@ -1386,12 +1378,10 @@ static int start_context(struct cxlflash_cfg *cfg)
 
 /**
  * read_vpd() - obtains the WWPNs from VPD
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  * @wwpn:	Array of size NUM_FC_PORTS to pass back WWPNs
  *
- * Return:
- *	0 on success
- *	-ENODEV when VPD or WWPN keywords not found
+ * Return: 0 on success, -errno on failure
  */
 static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[])
 {
@@ -1479,7 +1469,7 @@ out:
 
 /**
  * init_pcr() - initialize the provisioning and control registers
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
  * Also sets up fast access to the mapped registers and initializes AFU
  * command fields that never change.
@@ -1518,7 +1508,7 @@ static void init_pcr(struct cxlflash_cfg *cfg)
 
 /**
  * init_global() - initialize AFU global registers
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  */
 static int init_global(struct cxlflash_cfg *cfg)
 {
@@ -1603,7 +1593,7 @@ out:
 
 /**
  * start_afu() - initializes and starts the AFU
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  */
 static int start_afu(struct cxlflash_cfg *cfg)
 {
@@ -1637,12 +1627,9 @@ static int start_afu(struct cxlflash_cfg *cfg)
 
 /**
  * init_mc() - create and register as the master context
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
- * Return:
- *	0 on success
- *	-ENOMEM when unable to obtain a context from CXL services
- *	A failure value from CXL services.
+ * Return: 0 on success, -errno on failure
  */
 static int init_mc(struct cxlflash_cfg *cfg)
 {
@@ -1726,15 +1713,12 @@ out:
 
 /**
  * init_afu() - setup as master context and start AFU
- * @cxlflash:	Internal structure associated with the host.
+ * @cfg:	Internal structure associated with the host.
  *
  * This routine is a higher level of control for configuring the
  * AFU on probe and reset paths.
  *
- * Return:
- *	0 on success
- *	-ENOMEM when unable to map the AFU MMIO space
- *	A failure value from internal services.
+ * Return: 0 on success, -errno on failure
  */
 static int init_afu(struct cxlflash_cfg *cfg)
 {
@@ -1887,9 +1871,7 @@ out:
  * afu_reset() - resets the AFU
  * @cfg:	Internal structure associated with the host.
  *
- * Return:
- *	0 on success
- *	A failure value from internal services.
+ * Return: 0 on success, -errno on failure
  */
 static int afu_reset(struct cxlflash_cfg *cfg)
 {
@@ -2374,7 +2356,7 @@ static void cxlflash_worker_thread(struct work_struct *work)
  * @pdev:	PCI device associated with the host.
  * @dev_id:	PCI device id associated with device.
  *
- * Return: 0 on success / non-zero on failure
+ * Return: 0 on success, -errno on failure
  */
 static int cxlflash_probe(struct pci_dev *pdev,
 			  const struct pci_device_id *dev_id)
@@ -2608,7 +2590,7 @@ static struct pci_driver cxlflash_driver = {
 /**
  * init_cxlflash() - module entry point
  *
- * Return: 0 on success / non-zero on failure
+ * Return: 0 on success, -errno on failure
  */
 static int __init init_cxlflash(void)
 {
-- 
2.1.0

* [PATCH v2 25/30] cxlflash: Fix MMIO and endianness errors
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (23 preceding siblings ...)
  2015-09-16 21:31 ` [PATCH v2 24/30] cxlflash: Fix function prolog parameters and return codes Matthew R. Ochs
@ 2015-09-16 21:32 ` Matthew R. Ochs
  2015-09-23 15:03   ` Brian King
  2015-09-16 21:32 ` [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure Matthew R. Ochs
                   ` (4 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:32 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Sparse uncovered several errors with MMIO operations (accessing
directly) and handling endianness. These can cause issues when
running in different environments.

Introduce __iomem and proper endianness tags/swaps where
appropriate to make the driver sparse clean.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
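As background for the swab*() replacements in this patch, the userspace
sketch below contrasts an unconditional byte swap with an
endian-aware conversion. The helper names (load_be64, swap64,
store_be16) are hypothetical and written from scratch; they only mimic
the effect of the kernel helpers and are not the kernel implementations.

```c
#include <stdint.h>

/*
 * Read a big-endian 64-bit value from a byte buffer. The result is the
 * same on any host, which is the guarantee be64_to_cpu() gives for a
 * value tagged __be64.
 */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Unconditional byte swap, comparable in effect to swab64(). It yields
 * the big-endian interpretation only when the host is little-endian.
 */
static uint64_t swap64(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/* Store a 16-bit value in big-endian byte order, whatever the host is */
static void store_be16(unsigned char *p, uint16_t v)
{
	p[0] = (unsigned char)(v >> 8);
	p[1] = (unsigned char)(v & 0xff);
}
```

An unconditional swap happens to give the right answer on a
little-endian host and the wrong one on a big-endian host, which is why
the conversions in the diff are expressed in terms of the data's
byte order instead.
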
 drivers/scsi/cxlflash/common.h    | 10 +++++-----
 drivers/scsi/cxlflash/main.c      | 25 +++++++++++++------------
 drivers/scsi/cxlflash/superpipe.c |  6 +++---
 drivers/scsi/cxlflash/superpipe.h |  2 +-
 drivers/scsi/cxlflash/vlun.c      |  4 ++--
 5 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index f392319..b893046 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -165,9 +165,9 @@ struct afu {
 
 	/* AFU HW */
 	struct cxl_ioctl_start_work work;
-	struct cxlflash_afu_map *afu_map;	/* entire MMIO map */
-	struct sisl_host_map *host_map;		/* MC host map */
-	struct sisl_ctrl_map *ctrl_map;		/* MC control map */
+	struct cxlflash_afu_map __iomem *afu_map;	/* entire MMIO map */
+	struct sisl_host_map __iomem *host_map;		/* MC host map */
+	struct sisl_ctrl_map __iomem *ctrl_map;		/* MC control map */
 
 	ctx_hndl_t ctx_hndl;	/* master's context handle */
 	u64 *hrrq_start;
@@ -189,10 +189,10 @@ struct afu {
 
 static inline u64 lun_to_lunid(u64 lun)
 {
-	u64 lun_id;
+	__be64 lun_id;
 
 	int_to_scsilun(lun, (struct scsi_lun *)&lun_id);
-	return swab64(lun_id);
+	return be64_to_cpu(lun_id);
 }
 
 int cxlflash_afu_sync(struct afu *, ctx_hndl_t, res_hndl_t, u8);
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index e22fc7e..770c515 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -644,7 +644,7 @@ static void stop_afu(struct cxlflash_cfg *cfg)
 			complete(&afu->cmd[i].cevent);
 
 		if (likely(afu->afu_map)) {
-			cxl_psa_unmap((void *)afu->afu_map);
+			cxl_psa_unmap((void __iomem *)afu->afu_map);
 			afu->afu_map = NULL;
 		}
 	}
@@ -915,7 +915,7 @@ out:
  * that the FC link layer has synced, completed the handshaking process, and
  * is ready for login to start.
  */
-static void set_port_online(u64 *fc_regs)
+static void set_port_online(__be64 __iomem *fc_regs)
 {
 	u64 cmdcfg;
 
@@ -931,7 +931,7 @@ static void set_port_online(u64 *fc_regs)
  *
  * The provided MMIO region must be mapped prior to call.
  */
-static void set_port_offline(u64 *fc_regs)
+static void set_port_offline(__be64 __iomem *fc_regs)
 {
 	u64 cmdcfg;
 
@@ -955,7 +955,7 @@ static void set_port_offline(u64 *fc_regs)
  *	FALSE (0) when the specified port fails to come online after timeout
  *	-EINVAL when @delay_us is less than 1000
  */
-static int wait_port_online(u64 *fc_regs, u32 delay_us, u32 nretry)
+static int wait_port_online(__be64 __iomem *fc_regs, u32 delay_us, u32 nretry)
 {
 	u64 status;
 
@@ -986,7 +986,7 @@ static int wait_port_online(u64 *fc_regs, u32 delay_us, u32 nretry)
  *	FALSE (0) when the specified port fails to go offline after timeout
  *	-EINVAL when @delay_us is less than 1000
  */
-static int wait_port_offline(u64 *fc_regs, u32 delay_us, u32 nretry)
+static int wait_port_offline(__be64 __iomem *fc_regs, u32 delay_us, u32 nretry)
 {
 	u64 status;
 
@@ -1021,7 +1021,8 @@ static int wait_port_offline(u64 *fc_regs, u32 delay_us, u32 nretry)
  *	0 when the WWPN is successfully written and the port comes back online
  *	-1 when the port fails to go offline or come back up online
  */
-static int afu_set_wwpn(struct afu *afu, int port, u64 *fc_regs, u64 wwpn)
+static int afu_set_wwpn(struct afu *afu, int port, __be64 __iomem *fc_regs,
+			u64 wwpn)
 {
 	int rc = 0;
 
@@ -1066,7 +1067,7 @@ static int afu_set_wwpn(struct afu *afu, int port, u64 *fc_regs, u64 wwpn)
  * the alternate port exclusively while the reset takes place.
  * failure to come online is overridden.
  */
-static void afu_link_reset(struct afu *afu, int port, u64 *fc_regs)
+static void afu_link_reset(struct afu *afu, int port, __be64 __iomem *fc_regs)
 {
 	u64 port_sel;
 
@@ -1288,7 +1289,7 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 	struct device *dev = &cfg->dev->dev;
 	u64 reg_unmasked;
 	const struct asyc_intr_info *info;
-	struct sisl_global_map *global = &afu->afu_map->global;
+	struct sisl_global_map __iomem *global = &afu->afu_map->global;
 	u64 reg;
 	u8 port;
 	int i;
@@ -1477,7 +1478,7 @@ out:
 static void init_pcr(struct cxlflash_cfg *cfg)
 {
 	struct afu *afu = cfg->afu;
-	struct sisl_ctrl_map *ctrl_map;
+	struct sisl_ctrl_map __iomem *ctrl_map;
 	int i;
 
 	for (i = 0; i < MAX_CONTEXT; i++) {
@@ -1766,7 +1767,7 @@ static int init_afu(struct cxlflash_cfg *cfg)
 		dev_err(dev, "%s: call to start_afu failed, rc=%d!\n",
 			__func__, rc);
 		term_mc(cfg, UNDO_START);
-		cxl_psa_unmap((void *)afu->afu_map);
+		cxl_psa_unmap((void __iomem *)afu->afu_map);
 		afu->afu_map = NULL;
 		goto err1;
 	}
@@ -1846,8 +1847,8 @@ retry:
 	cmd->rcb.cdb[1] = mode;
 
 	/* The cdb is aligned, no unaligned accessors required */
-	*((u16 *)&cmd->rcb.cdb[2]) = swab16(ctx_hndl_u);
-	*((u32 *)&cmd->rcb.cdb[4]) = swab32(res_hndl_u);
+	*((__be16 *)&cmd->rcb.cdb[2]) = cpu_to_be16(ctx_hndl_u);
+	*((__be32 *)&cmd->rcb.cdb[4]) = cpu_to_be32(res_hndl_u);
 
 	rc = send_cmd(afu, cmd);
 	if (unlikely(rc))
diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index 5d51c65..fb79b79fe 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -253,7 +253,7 @@ static int afu_attach(struct cxlflash_cfg *cfg, struct ctx_info *ctxi)
 {
 	struct device *dev = &cfg->dev->dev;
 	struct afu *afu = cfg->afu;
-	struct sisl_ctrl_map *ctrl_map = ctxi->ctrl_map;
+	struct sisl_ctrl_map __iomem *ctrl_map = ctxi->ctrl_map;
 	int rc = 0;
 	u64 val;
 
@@ -365,8 +365,8 @@ retry:
 	 * as the buffer is allocated on an aligned boundary.
 	 */
 	mutex_lock(&gli->mutex);
-	gli->max_lba = be64_to_cpu(*((u64 *)&cmd_buf[0]));
-	gli->blk_len = be32_to_cpu(*((u32 *)&cmd_buf[8]));
+	gli->max_lba = be64_to_cpu(*((__be64 *)&cmd_buf[0]));
+	gli->blk_len = be32_to_cpu(*((__be32 *)&cmd_buf[8]));
 	mutex_unlock(&gli->mutex);
 
 out:
diff --git a/drivers/scsi/cxlflash/superpipe.h b/drivers/scsi/cxlflash/superpipe.h
index 7947091..7df88ee 100644
--- a/drivers/scsi/cxlflash/superpipe.h
+++ b/drivers/scsi/cxlflash/superpipe.h
@@ -91,7 +91,7 @@ enum ctx_ctrl {
 #define DECODE_CTXID(_val)	(_val & 0xFFFFFFFF)
 
 struct ctx_info {
-	struct sisl_ctrl_map *ctrl_map; /* initialized at startup */
+	struct sisl_ctrl_map __iomem *ctrl_map; /* initialized at startup */
 	struct sisl_rht_entry *rht_start; /* 1 page (req'd for alignment),
 					     alloc/free on attach/detach */
 	u32 rht_out;		/* Number of checked out RHT entries */
diff --git a/drivers/scsi/cxlflash/vlun.c b/drivers/scsi/cxlflash/vlun.c
index 96b074f..f91b5b3 100644
--- a/drivers/scsi/cxlflash/vlun.c
+++ b/drivers/scsi/cxlflash/vlun.c
@@ -786,7 +786,7 @@ void cxlflash_restore_luntable(struct cxlflash_cfg *cfg)
 	u32 chan;
 	u32 lind;
 	struct afu *afu = cfg->afu;
-	struct sisl_global_map *agm = &afu->afu_map->global;
+	struct sisl_global_map __iomem *agm = &afu->afu_map->global;
 
 	mutex_lock(&global.mutex);
 
@@ -831,7 +831,7 @@ static int init_luntable(struct cxlflash_cfg *cfg, struct llun_info *lli)
 	u32 lind;
 	int rc = 0;
 	struct afu *afu = cfg->afu;
-	struct sisl_global_map *agm = &afu->afu_map->global;
+	struct sisl_global_map __iomem *agm = &afu->afu_map->global;
 
 	mutex_lock(&global.mutex);
 
-- 
2.1.0

* [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (24 preceding siblings ...)
  2015-09-16 21:32 ` [PATCH v2 25/30] cxlflash: Fix MMIO and endianness errors Matthew R. Ochs
@ 2015-09-16 21:32 ` Matthew R. Ochs
  2015-09-23 19:09   ` Brian King
  2015-09-16 21:32 ` [PATCH v2 27/30] cxlflash: Correct spelling, grammar, and alignment mistakes Matthew R. Ochs
                   ` (3 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:32 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

The process_sense() routine can perform a read capacity which
can take some time to complete. If an EEH occurs while waiting
on the read capacity, the EEH handler is unable to obtain the
context's mutex in order to put the context in an error state.
The EEH handler will sit and wait until the context is free,
but this wait can last longer than the EEH handler tolerates,
leading to a failed recovery.

To address this issue, make the context unavailable to new,
non-system owned threads and release the context while calling
into process_sense(). After returning from process_sense() the
context mutex is reacquired and the context is made available
again. The context can be safely moved to the error state if
needed during the unavailable window as no other threads will
hold its reference.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index fb79b79fe..1c5e9ac 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -1790,12 +1790,21 @@ static int cxlflash_disk_verify(struct scsi_device *sdev,
 	 * inquiry (i.e. the Unit attention is due to the WWN changing).
 	 */
 	if (verify->hint & DK_CXLFLASH_VERIFY_HINT_SENSE) {
+		/* Can't hold mutex across process_sense/read_cap16,
+		 * since we could have an intervening EEH event.
+		 */
+		ctxi->unavail = true;
+		mutex_unlock(&ctxi->mutex);
 		rc = process_sense(sdev, verify);
 		if (unlikely(rc)) {
 			dev_err(dev, "%s: Failed to validate sense data (%d)\n",
 				__func__, rc);
+			mutex_lock(&ctxi->mutex);
+			ctxi->unavail = false;
 			goto out;
 		}
+		mutex_lock(&ctxi->mutex);
+		ctxi->unavail = false;
 	}
 
 	switch (gli->mode) {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 79+ messages in thread
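
The locking change in patch 26/30 can be sketched in userspace as follows. This is a minimal illustration and not the driver code: struct ctx, ctx_get(), ctx_put(), mark_error(), verify() and slow_op_with_error() are hypothetical names, and a pthread mutex stands in for the kernel mutex. The owner marks the context unavailable, drops the mutex for the slow operation, and retakes it afterwards, so a recovery path only ever waits for a short critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-context structure. */
struct ctx {
	pthread_mutex_t mutex;
	bool unavail;	/* set while the owner has dropped the mutex */
	bool err_state;	/* set by the recovery path */
};

/* Look up a context; new callers are refused while it is unavailable.
 * On success the context is returned with its mutex held. */
static struct ctx *ctx_get(struct ctx *c)
{
	pthread_mutex_lock(&c->mutex);
	if (c->unavail) {
		pthread_mutex_unlock(&c->mutex);
		return NULL;
	}
	return c;
}

static void ctx_put(struct ctx *c)
{
	pthread_mutex_unlock(&c->mutex);
}

/* Recovery path: only needs the mutex for a short critical section. */
static void mark_error(struct ctx *c)
{
	pthread_mutex_lock(&c->mutex);
	c->err_state = true;
	pthread_mutex_unlock(&c->mutex);
}

/* Simulated slow operation during which an error event arrives. */
static int slow_op_with_error(struct ctx *c)
{
	assert(ctx_get(c) == NULL);	/* other users are kept out */
	mark_error(c);			/* recovery does not block */
	return 0;
}

/* Caller holds c->mutex on entry and on return. */
static int verify(struct ctx *c, int (*slow_op)(struct ctx *))
{
	int rc;

	assert(!c->unavail);
	c->unavail = true;
	pthread_mutex_unlock(&c->mutex);	/* do not hold across slow op */
	rc = slow_op(c);
	pthread_mutex_lock(&c->mutex);
	c->unavail = false;
	return rc;
}
```

If verify() kept the mutex held across slow_op(), the mark_error() call in this single-threaded sketch would never return, which mirrors the stalled EEH handler described in the commit message.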

* [PATCH v2 27/30] cxlflash: Correct spelling, grammar, and alignment mistakes
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (25 preceding siblings ...)
  2015-09-16 21:32 ` [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure Matthew R. Ochs
@ 2015-09-16 21:32 ` Matthew R. Ochs
  2015-09-23 19:13   ` Brian King
  2015-09-16 21:32 ` [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ Matthew R. Ochs
                   ` (2 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:32 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

There are several spelling and grammar mistakes throughout the
driver. Additionally there are a handful of places where there
are extra lines and unnecessary variables/statements. These are
a nuisance and pollute the driver.

Fix spelling and grammar issues. Update some comments for clarity and
consistency. Remove extra lines and a few unneeded variables/statements.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h    |  2 --
 drivers/scsi/cxlflash/main.c      | 62 +++++++++++++++++----------------------
 drivers/scsi/cxlflash/sislite.h   |  6 ++--
 drivers/scsi/cxlflash/superpipe.c |  2 +-
 drivers/scsi/cxlflash/vlun.c      | 14 ++++-----
 5 files changed, 38 insertions(+), 48 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index b893046..e6041b9 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -106,8 +106,6 @@ struct cxlflash_cfg {
 	atomic_t remove_active;
 
 	struct cxl_afu *cxl_afu;
-
-	struct pci_pool *cxlflash_cmd_pool;
 	struct pci_dev *parent_dev;
 
 	atomic_t recovery_threads;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 770c515..a5b45ed 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -34,7 +34,6 @@ MODULE_AUTHOR("Manoj N. Kumar <manoj@linux.vnet.ibm.com>");
 MODULE_AUTHOR("Matthew R. Ochs <mrochs@linux.vnet.ibm.com>");
 MODULE_LICENSE("GPL");
 
-
 /**
  * cmd_checkout() - checks out an AFU command
  * @afu:	AFU to checkout from.
@@ -731,7 +730,7 @@ static void cxlflash_remove(struct pci_dev *pdev)
 	case INIT_STATE_SCSI:
 		cxlflash_term_local_luns(cfg);
 		scsi_remove_host(cfg->host);
-		/* Fall through */
+		/* fall through */
 	case INIT_STATE_AFU:
 		cancel_work_sync(&cfg->work_q);
 		term_afu(cfg);
@@ -764,9 +763,7 @@ static int alloc_mem(struct cxlflash_cfg *cfg)
 	char *buf = NULL;
 	struct device *dev = &cfg->dev->dev;
 
-	/* This allocation is about 12K, i.e. only 1 64k page
-	 * and upto 4 4k pages
-	 */
+	/* AFU is ~12k, i.e. only one 64k page or up to four 4k pages */
 	cfg->afu = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
 					    get_order(sizeof(struct afu)));
 	if (unlikely(!cfg->afu)) {
@@ -1306,10 +1303,10 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 		goto out;
 	}
 
-	/* it is OK to clear AFU status before FC_ERROR */
+	/* FYI, it is 'okay' to clear AFU status before FC_ERROR */
 	writeq_be(reg_unmasked, &global->regs.aintr_clear);
 
-	/* check each bit that is on */
+	/* Check each bit that is on */
 	for (i = 0; reg_unmasked; i++, reg_unmasked = (reg_unmasked >> 1)) {
 		info = find_ainfo(1ULL << i);
 		if (((reg_unmasked & 0x1) == 0) || !info)
@@ -1322,7 +1319,7 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 		       readq_be(&global->fc_regs[port][FC_STATUS / 8]));
 
 		/*
-		 * do link reset first, some OTHER errors will set FC_ERROR
+		 * Do link reset first, some OTHER errors will set FC_ERROR
 		 * again if cleared before or w/o a reset
 		 */
 		if (info->action & LINK_RESET) {
@@ -1337,7 +1334,7 @@ static irqreturn_t cxlflash_async_err_irq(int irq, void *data)
 			reg = readq_be(&global->fc_regs[port][FC_ERROR / 8]);
 
 			/*
-			 * since all errors are unmasked, FC_ERROR and FC_ERRCAP
+			 * Since all errors are unmasked, FC_ERROR and FC_ERRCAP
 			 * should be the same and tracing one is sufficient.
 			 */
 
@@ -1483,23 +1480,22 @@ static void init_pcr(struct cxlflash_cfg *cfg)
 
 	for (i = 0; i < MAX_CONTEXT; i++) {
 		ctrl_map = &afu->afu_map->ctrls[i].ctrl;
-		/* disrupt any clients that could be running */
-		/* e. g. clients that survived a master restart */
+		/* Disrupt any clients that could be running */
+		/* e.g. clients that survived a master restart */
 		writeq_be(0, &ctrl_map->rht_start);
 		writeq_be(0, &ctrl_map->rht_cnt_id);
 		writeq_be(0, &ctrl_map->ctx_cap);
 	}
 
-	/* copy frequently used fields into afu */
+	/* Copy frequently used fields into afu */
 	afu->ctx_hndl = (u16) cxl_process_element(cfg->mcctx);
-	/* ctx_hndl is 16 bits in CAIA */
 	afu->host_map = &afu->afu_map->hosts[afu->ctx_hndl].host;
 	afu->ctrl_map = &afu->afu_map->ctrls[afu->ctx_hndl].ctrl;
 
 	/* Program the Endian Control for the master context */
 	writeq_be(SISL_ENDIAN_CTRL, &afu->host_map->endian_ctrl);
 
-	/* initialize cmd fields that never change */
+	/* Initialize cmd fields that never change */
 	for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
 		afu->cmd[i].rcb.ctx_id = afu->ctx_hndl;
 		afu->cmd[i].rcb.msi = SISL_MSI_RRQ_UPDATED;
@@ -1528,7 +1524,7 @@ static int init_global(struct cxlflash_cfg *cfg)
 
 	pr_debug("%s: wwpn0=0x%llX wwpn1=0x%llX\n", __func__, wwpn[0], wwpn[1]);
 
-	/* set up RRQ in AFU for master issued cmds */
+	/* Set up RRQ in AFU for master issued cmds */
 	writeq_be((u64) afu->hrrq_start, &afu->host_map->rrq_start);
 	writeq_be((u64) afu->hrrq_end, &afu->host_map->rrq_end);
 
@@ -1541,9 +1537,9 @@ static int init_global(struct cxlflash_cfg *cfg)
 	/* checker on if dual afu */
 	writeq_be(reg, &afu->afu_map->global.regs.afu_config);
 
-	/* global port select: select either port */
+	/* Global port select: select either port */
 	if (afu->internal_lun) {
-		/* only use port 0 */
+		/* Only use port 0 */
 		writeq_be(PORT0, &afu->afu_map->global.regs.afu_port_sel);
 		num_ports = NUM_FC_PORTS - 1;
 	} else {
@@ -1552,15 +1548,15 @@ static int init_global(struct cxlflash_cfg *cfg)
 	}
 
 	for (i = 0; i < num_ports; i++) {
-		/* unmask all errors (but they are still masked at AFU) */
+		/* Unmask all errors (but they are still masked at AFU) */
 		writeq_be(0, &afu->afu_map->global.fc_regs[i][FC_ERRMSK / 8]);
-		/* clear CRC error cnt & set a threshold */
+		/* Clear CRC error cnt & set a threshold */
 		(void)readq_be(&afu->afu_map->global.
 			       fc_regs[i][FC_CNT_CRCERR / 8]);
 		writeq_be(MC_CRC_THRESH, &afu->afu_map->global.fc_regs[i]
 			  [FC_CRC_THRESH / 8]);
 
-		/* set WWPNs. If already programmed, wwpn[i] is 0 */
+		/* Set WWPNs. If already programmed, wwpn[i] is 0 */
 		if (wwpn[i] != 0 &&
 		    afu_set_wwpn(afu, i,
 				 &afu->afu_map->global.fc_regs[i][0],
@@ -1574,18 +1570,17 @@ static int init_global(struct cxlflash_cfg *cfg)
 		 * offline/online transitions and a PLOGI
 		 */
 		msleep(100);
-
 	}
 
-	/* set up master's own CTX_CAP to allow real mode, host translation */
-	/* tbls, afu cmds and read/write GSCSI cmds. */
+	/* Set up master's own CTX_CAP to allow real mode, host translation */
+	/* tables, afu cmds and read/write GSCSI cmds. */
 	/* First, unlock ctx_cap write by reading mbox */
 	(void)readq_be(&afu->ctrl_map->mbox_r);	/* unlock ctx_cap */
 	writeq_be((SISL_CTX_CAP_REAL_MODE | SISL_CTX_CAP_HOST_XLATE |
 		   SISL_CTX_CAP_READ_CMD | SISL_CTX_CAP_WRITE_CMD |
 		   SISL_CTX_CAP_AFU_CMD | SISL_CTX_CAP_GSCSI_CMD),
 		  &afu->ctrl_map->ctx_cap);
-	/* init heartbeat */
+	/* Initialize heartbeat */
 	afu->hb = readq_be(&afu->afu_map->global.regs.afu_hb);
 
 out:
@@ -1614,7 +1609,7 @@ static int start_afu(struct cxlflash_cfg *cfg)
 
 	init_pcr(cfg);
 
-	/* initialize RRQ pointers */
+	/* Initialize RRQ pointers */
 	afu->hrrq_start = &afu->rrq_entry[0];
 	afu->hrrq_end = &afu->rrq_entry[NUM_RRQ_ENTRY - 1];
 	afu->hrrq_curr = afu->hrrq_start;
@@ -1737,8 +1732,7 @@ static int init_afu(struct cxlflash_cfg *cfg)
 		goto err1;
 	}
 
-	/* Map the entire MMIO space of the AFU.
-	 */
+	/* Map the entire MMIO space of the AFU */
 	afu->afu_map = cxl_psa_map(cfg->mcctx);
 	if (!afu->afu_map) {
 		rc = -ENOMEM;
@@ -1790,7 +1784,7 @@ err1:
  * @mode:	Type of sync to issue (lightweight, heavyweight, global).
  *
  * The AFU can only take 1 sync command at a time. This routine enforces this
- * limitation by using a mutex to provide exlusive access to the AFU during
+ * limitation by using a mutex to provide exclusive access to the AFU during
  * the sync. This design point requires calling threads to not be on interrupt
  * context due to the possibility of sleeping during concurrent sync operations.
  *
@@ -1856,7 +1850,7 @@ retry:
 
 	wait_resp(afu, cmd);
 
-	/* set on timeout */
+	/* Set on timeout */
 	if (unlikely((cmd->sa.ioasc != 0) ||
 		     (cmd->sa.host_use_b[0] & B_ERROR)))
 		rc = -1;
@@ -2273,7 +2267,7 @@ static struct scsi_host_template driver_template = {
 	.cmd_per_lun = 16,
 	.can_queue = CXLFLASH_MAX_CMDS,
 	.this_id = -1,
-	.sg_tablesize = SG_NONE,	/* No scatter gather support. */
+	.sg_tablesize = SG_NONE,	/* No scatter gather support */
 	.max_sectors = CXLFLASH_MAX_SECTORS,
 	.use_clustering = ENABLE_CLUSTERING,
 	.shost_attrs = cxlflash_host_attrs,
@@ -2333,8 +2327,7 @@ static void cxlflash_worker_thread(struct work_struct *work)
 
 			/* The reset can block... */
 			afu_link_reset(afu, port,
-				       &afu->afu_map->
-				       global.fc_regs[port][0]);
+				       &afu->afu_map->global.fc_regs[port][0]);
 			spin_lock_irqsave(cfg->host->host_lock, lock_flags);
 		}
 
@@ -2413,7 +2406,6 @@ static int cxlflash_probe(struct pci_dev *pdev,
 	cfg->last_lun_index[1] = CXLFLASH_NUM_VLUNS/2 - 1;
 
 	cfg->dev_id = (struct pci_device_id *)dev_id;
-	cfg->mcctx = NULL;
 
 	init_waitqueue_head(&cfg->tmf_waitq);
 	init_waitqueue_head(&cfg->reset_waitq);
@@ -2429,7 +2421,8 @@ static int cxlflash_probe(struct pci_dev *pdev,
 
 	pci_set_drvdata(pdev, cfg);
 
-	/* Use the special service provided to look up the physical
+	/*
+	 * Use the special service provided to look up the physical
 	 * PCI device, since we are called on the probe of the virtual
 	 * PCI host bus (vphb)
 	 */
@@ -2459,7 +2452,6 @@ static int cxlflash_probe(struct pci_dev *pdev,
 	}
 	cfg->init_state = INIT_STATE_AFU;
 
-
 	rc = init_scsi(cfg);
 	if (rc) {
 		dev_err(&pdev->dev, "%s: call to init_scsi "
diff --git a/drivers/scsi/cxlflash/sislite.h b/drivers/scsi/cxlflash/sislite.h
index 8425d1a..0b3366f 100644
--- a/drivers/scsi/cxlflash/sislite.h
+++ b/drivers/scsi/cxlflash/sislite.h
@@ -146,7 +146,7 @@ struct sisl_rc {
 #define SISL_FC_RC_ABORTFAIL	0x59	/* pending abort completed w/fail */
 #define SISL_FC_RC_RESID	0x5A	/* ioasa underrun/overrun flags set */
 #define SISL_FC_RC_RESIDERR	0x5B	/* actual data len does not match SCSI
-					   reported len, possbly due to dropped
+					   reported len, possibly due to dropped
 					   frames */
 #define SISL_FC_RC_TGTABORT	0x5C	/* command aborted by target */
 };
@@ -258,7 +258,7 @@ struct sisl_host_map {
 	__be64 rrq_start;	/* start & end are both inclusive */
 	__be64 rrq_end;		/* write sequence: start followed by end */
 	__be64 cmd_room;
-	__be64 ctx_ctrl;	/* least signiifcant byte or b56:63 is LISN# */
+	__be64 ctx_ctrl;	/* least significant byte or b56:63 is LISN# */
 	__be64 mbox_w;		/* restricted use */
 };
 
@@ -290,7 +290,7 @@ struct sisl_global_regs {
 #define SISL_ASTATUS_FC0_LOGO    0x4000ULL /* b49, target sent FLOGI/PLOGI/LOGO
 						   while logged in */
 #define SISL_ASTATUS_FC0_CRC_T   0x2000ULL /* b50, CRC threshold exceeded */
-#define SISL_ASTATUS_FC0_LOGI_R  0x1000ULL /* b51, login state mechine timed out
+#define SISL_ASTATUS_FC0_LOGI_R  0x1000ULL /* b51, login state machine timed out
 						   and retrying */
 #define SISL_ASTATUS_FC0_LOGI_F  0x0800ULL /* b52, login failed,
 					      FC_ERROR[19:0] */
diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index 1c5e9ac..9844788 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -76,7 +76,7 @@ void cxlflash_free_errpage(void)
  *
  * When the host needs to go down, all users must be quiesced and their
  * memory freed. This is accomplished by putting the contexts in error
- * state which will notify the user and let them 'drive' the tear-down.
+ * state which will notify the user and let them 'drive' the tear down.
  * Meanwhile, this routine camps until all user contexts have been removed.
  */
 void cxlflash_stop_term_user_contexts(struct cxlflash_cfg *cfg)
diff --git a/drivers/scsi/cxlflash/vlun.c b/drivers/scsi/cxlflash/vlun.c
index f91b5b3..b0eaf55 100644
--- a/drivers/scsi/cxlflash/vlun.c
+++ b/drivers/scsi/cxlflash/vlun.c
@@ -132,7 +132,7 @@ static int ba_init(struct ba_lun *ba_lun)
 		return -ENOMEM;
 	}
 
-	/* Pass the allocated lun info as a handle to the user */
+	/* Pass the allocated LUN info as a handle to the user */
 	ba_lun->ba_lun_handle = bali;
 
 	pr_debug("%s: Successfully initialized the LUN: "
@@ -165,7 +165,7 @@ static int find_free_range(u32 low,
 			num_bits = (sizeof(*lam) * BITS_PER_BYTE);
 			bit_pos = find_first_bit(lam, num_bits);
 
-			pr_devel("%s: Found free bit %llX in lun "
+			pr_devel("%s: Found free bit %llX in LUN "
 				 "map entry %llX at bitmap index = %X\n",
 				 __func__, bit_pos, bali->lun_alloc_map[i],
 				 i);
@@ -682,14 +682,14 @@ out:
 }
 
 /**
- * _cxlflash_vlun_resize() - changes the size of a virtual lun
+ * _cxlflash_vlun_resize() - changes the size of a virtual LUN
  * @sdev:	SCSI device associated with LUN owning virtual LUN.
  * @ctxi:	Context owning resources.
  * @resize:	Resize ioctl data structure.
  *
  * On successful return, the user is informed of the new size (in blocks)
- * of the virtual lun in last LBA format. When the size of the virtual
- * lun is zero, the last LBA is reflected as -1. See comment in the
+ * of the virtual LUN in last LBA format. When the size of the virtual
+ * LUN is zero, the last LBA is reflected as -1. See comment in the
  * prologue for _cxlflash_disk_release() regarding AFU syncs and contexts
  * on the error recovery list.
  *
@@ -886,8 +886,8 @@ out:
  * @arg:	UVirtual ioctl data structure.
  *
  * On successful return, the user is informed of the resource handle
- * to be used to identify the virtual lun and the size (in blocks) of
- * the virtual lun in last LBA format. When the size of the virtual lun
+ * to be used to identify the virtual LUN and the size (in blocks) of
+ * the virtual LUN in last LBA format. When the size of the virtual LUN
  * is zero, the last LBA is reflected as -1.
  *
  * Return: 0 on success, -errno on failure
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (26 preceding siblings ...)
  2015-09-16 21:32 ` [PATCH v2 27/30] cxlflash: Correct spelling, grammar, and alignment mistakes Matthew R. Ochs
@ 2015-09-16 21:32 ` Matthew R. Ochs
  2015-09-23 19:18   ` Brian King
  2015-09-16 21:32 ` [PATCH v2 29/30] cxlflash: Fix to avoid state change collision Matthew R. Ochs
  2015-09-16 21:33 ` [PATCH v2 30/30] MAINTAINERS: Add cxlflash driver Matthew R. Ochs
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:32 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Following an adapter reset, the AFU RRQ that resides in host memory
holds stale data. This can lead to a condition where the RRQ interrupt
handler tries to process stale entries and/or endlessly loops due to an
out of sync generation bit.

To fix, the AFU RRQ in host memory needs to be cleared after each reset.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index a5b45ed..0487fac 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1609,6 +1609,9 @@ static int start_afu(struct cxlflash_cfg *cfg)
 
 	init_pcr(cfg);
 
+	/* After an AFU reset, RRQ entries are stale, clear them */
+	memset(&afu->rrq_entry, 0, sizeof(afu->rrq_entry));
+
 	/* Initialize RRQ pointers */
 	afu->hrrq_start = &afu->rrq_entry[0];
 	afu->hrrq_end = &afu->rrq_entry[NUM_RRQ_ENTRY - 1];
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 79+ messages in thread
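
The stale-entry problem addressed by patch 28/30 can be illustrated with a small ring model. This is a sketch under assumptions, not the driver's RRQ handler: struct rrq, rrq_post(), rrq_drain() and rrq_reset() are hypothetical, and the generation ("toggle") bit is assumed to live in bit 0 of each entry. The host treats an entry as valid when its generation bit matches the value it currently expects, so entries left over from before a reset look like fresh responses unless the ring is cleared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_ENTRIES 4
#define T_BIT 0x1ULL	/* generation bit, assumed to be bit 0 */

struct rrq {
	uint64_t entry[NUM_ENTRIES];
	unsigned int curr;	/* next slot the host will inspect */
	uint64_t toggle;	/* generation value marking a valid entry */
};

/* Device side (simulated): post one response with the given generation. */
static void rrq_post(struct rrq *q, unsigned int slot, uint64_t handle,
		     uint64_t gen)
{
	assert(slot < NUM_ENTRIES);
	q->entry[slot] = (handle & ~T_BIT) | gen;
}

/* Host side: consume entries whose generation bit matches the expected
 * value and return how many were consumed. The count is bounded so the
 * sketch always terminates. */
static unsigned int rrq_drain(struct rrq *q)
{
	unsigned int n = 0;

	while (n < NUM_ENTRIES &&
	       (q->entry[q->curr] & T_BIT) == q->toggle) {
		n++;
		if (++q->curr == NUM_ENTRIES) {
			q->curr = 0;
			q->toggle ^= T_BIT;	/* expect next generation */
		}
	}
	return n;
}

/* Reset the host-side view; optionally clear the ring as the patch does. */
static void rrq_reset(struct rrq *q, int clear)
{
	if (clear)
		memset(q->entry, 0, sizeof(q->entry));
	q->curr = 0;
	q->toggle = T_BIT;
}
```

Resetting the pointers without clearing makes rrq_drain() consume the old entries again, while resetting with the clear leaves nothing to consume until the device posts new responses.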

* [PATCH v2 29/30] cxlflash: Fix to avoid state change collision
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (27 preceding siblings ...)
  2015-09-16 21:32 ` [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ Matthew R. Ochs
@ 2015-09-16 21:32 ` Matthew R. Ochs
  2015-09-21 12:44   ` Tomas Henzl
  2015-09-16 21:33 ` [PATCH v2 30/30] MAINTAINERS: Add cxlflash driver Matthew R. Ochs
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:32 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

The adapter state machine is susceptible to missing and/or
corrupting state updates at runtime. This can lead to a variety
of unintended issues and is due to the lack of a serialization
mechanism to protect the adapter state.

Use an adapter-wide mutex to serialize state changes.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h    |  1 +
 drivers/scsi/cxlflash/main.c      | 40 +++++++++++++++++++++++++++++++++------
 drivers/scsi/cxlflash/superpipe.c |  7 ++++++-
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index e6041b9..c9b1ec6 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -128,6 +128,7 @@ struct cxlflash_cfg {
 	bool tmf_active;
 	wait_queue_head_t reset_waitq;
 	enum cxlflash_state state;
+	struct mutex mutex;
 };
 
 struct afu_cmd {
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 0487fac..a94340d 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -496,6 +496,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
 	struct afu *afu = cfg->afu;
 	struct device *dev = &cfg->dev->dev;
+	enum cxlflash_state state;
 	struct afu_cmd *cmd;
 	u32 port_sel = scp->device->channel + 1;
 	int nseg, i, ncount;
@@ -525,7 +526,11 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	}
 	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
-	switch (cfg->state) {
+	mutex_lock(&cfg->mutex);
+	state = cfg->state;
+	mutex_unlock(&cfg->mutex);
+
+	switch (state) {
 	case STATE_RESET:
 		dev_dbg_ratelimited(dev, "%s: device is in reset!\n", __func__);
 		rc = SCSI_MLQUEUE_HOST_BUSY;
@@ -722,7 +727,9 @@ static void cxlflash_remove(struct pci_dev *pdev)
 						  cfg->tmf_slock);
 	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
+	mutex_lock(&cfg->mutex);
 	cfg->state = STATE_FAILTERM;
+	mutex_unlock(&cfg->mutex);
 	atomic_inc(&cfg->remove_active);
 	cxlflash_stop_term_user_contexts(cfg);
 
@@ -1811,12 +1818,13 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 	int retry_cnt = 0;
 	static DEFINE_MUTEX(sync_active);
 
+	mutex_lock(&sync_active);
+	mutex_lock(&cfg->mutex);
 	if (cfg->state != STATE_NORMAL) {
 		pr_debug("%s: Sync not required! (%u)\n", __func__, cfg->state);
-		return 0;
+		goto out;
 	}
 
-	mutex_lock(&sync_active);
 retry:
 	cmd = cmd_checkout(afu);
 	if (unlikely(!cmd)) {
@@ -1858,6 +1866,7 @@ retry:
 		     (cmd->sa.host_use_b[0] & B_ERROR)))
 		rc = -1;
 out:
+	mutex_unlock(&cfg->mutex);
 	mutex_unlock(&sync_active);
 	if (cmd)
 		cmd_checkin(cmd);
@@ -1900,6 +1909,7 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
 	struct Scsi_Host *host = scp->device->host;
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
 	struct afu *afu = cfg->afu;
+	enum cxlflash_state state;
 	int rcr = 0;
 
 	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
@@ -1912,7 +1922,11 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
 		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
 
 retry:
-	switch (cfg->state) {
+	mutex_lock(&cfg->mutex);
+	state = cfg->state;
+	mutex_unlock(&cfg->mutex);
+
+	switch (state) {
 	case STATE_NORMAL:
 		rcr = send_tmf(afu, scp, TMF_LUN_RESET);
 		if (unlikely(rcr))
@@ -1954,6 +1968,7 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
 		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
 
+	mutex_lock(&cfg->mutex);
 	switch (cfg->state) {
 	case STATE_NORMAL:
 		cfg->state = STATE_RESET;
@@ -1967,7 +1982,9 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 		wake_up_all(&cfg->reset_waitq);
 		break;
 	case STATE_RESET:
+		mutex_unlock(&cfg->mutex);
 		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
+		mutex_lock(&cfg->mutex);
 		if (cfg->state == STATE_NORMAL)
 			break;
 		/* fall through */
@@ -1975,6 +1992,7 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
 		rc = FAILED;
 		break;
 	}
+	mutex_unlock(&cfg->mutex);
 
 	pr_debug("%s: returning rc=%d\n", __func__, rc);
 	return rc;
@@ -2312,10 +2330,11 @@ static void cxlflash_worker_thread(struct work_struct *work)
 	int port;
 	ulong lock_flags;
 
-	/* Avoid MMIO if the device has failed */
+	mutex_lock(&cfg->mutex);
 
+	/* Avoid MMIO if the device has failed */
 	if (cfg->state != STATE_NORMAL)
-		return;
+		goto out;
 
 	spin_lock_irqsave(cfg->host->host_lock, lock_flags);
 
@@ -2346,6 +2365,8 @@ static void cxlflash_worker_thread(struct work_struct *work)
 
 	if (atomic_dec_if_positive(&cfg->scan_host_needed) >= 0)
 		scsi_scan_host(cfg->host);
+out:
+	mutex_unlock(&cfg->mutex);
 }
 
 /**
@@ -2416,6 +2437,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
 	INIT_WORK(&cfg->work_q, cxlflash_worker_thread);
 	cfg->lr_state = LINK_RESET_INVALID;
 	cfg->lr_port = -1;
+	mutex_init(&cfg->mutex);
 	mutex_init(&cfg->ctx_tbl_list_mutex);
 	mutex_init(&cfg->ctx_recovery_mutex);
 	init_rwsem(&cfg->ioctl_rwsem);
@@ -2503,7 +2525,9 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
 
 	switch (state) {
 	case pci_channel_io_frozen:
+		mutex_lock(&cfg->mutex);
 		cfg->state = STATE_RESET;
+		mutex_unlock(&cfg->mutex);
 		scsi_block_requests(cfg->host);
 		drain_ioctls(cfg);
 		rc = cxlflash_mark_contexts_error(cfg);
@@ -2514,7 +2538,9 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
 		stop_afu(cfg);
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
+		mutex_lock(&cfg->mutex);
 		cfg->state = STATE_FAILTERM;
+		mutex_unlock(&cfg->mutex);
 		wake_up_all(&cfg->reset_waitq);
 		scsi_unblock_requests(cfg->host);
 		return PCI_ERS_RESULT_DISCONNECT;
@@ -2561,7 +2587,9 @@ static void cxlflash_pci_resume(struct pci_dev *pdev)
 
 	dev_dbg(dev, "%s: pdev=%p\n", __func__, pdev);
 
+	mutex_lock(&cfg->mutex);
 	cfg->state = STATE_NORMAL;
+	mutex_unlock(&cfg->mutex);
 	wake_up_all(&cfg->reset_waitq);
 	scsi_unblock_requests(cfg->host);
 }
diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
index 9844788..c3aaadf 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -1229,10 +1229,15 @@ static const struct file_operations null_fops = {
 static int check_state(struct cxlflash_cfg *cfg, bool ioctl)
 {
 	struct device *dev = &cfg->dev->dev;
+	enum cxlflash_state state;
 	int rc = 0;
 
 retry:
-	switch (cfg->state) {
+	mutex_lock(&cfg->mutex);
+	state = cfg->state;
+	mutex_unlock(&cfg->mutex);
+
+	switch (state) {
 	case STATE_RESET:
 		dev_dbg(dev, "%s: Reset state, going to wait...\n", __func__);
 		if (ioctl)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 79+ messages in thread
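
The read side of patch 29/30 follows a snapshot pattern: take the mutex, copy the state into a local, drop the mutex, then switch on the copy. A minimal userspace sketch follows, assuming a pthread mutex in place of the kernel mutex. The state names come from the patch, while struct cfg, set_state(), get_state() and queue_decision() are hypothetical helpers.

```c
#include <assert.h>
#include <pthread.h>

enum state { STATE_NORMAL, STATE_RESET, STATE_FAILTERM };

/* Hypothetical stand-in for the adapter configuration structure. */
struct cfg {
	pthread_mutex_t mutex;	/* serializes all state changes */
	enum state state;
};

/* Writers change the state only while holding the mutex. */
static void set_state(struct cfg *cfg, enum state new_state)
{
	assert(new_state <= STATE_FAILTERM);
	pthread_mutex_lock(&cfg->mutex);
	cfg->state = new_state;
	pthread_mutex_unlock(&cfg->mutex);
}

/* Readers take a snapshot under the mutex and act on the copy. */
static enum state get_state(struct cfg *cfg)
{
	enum state snapshot;

	pthread_mutex_lock(&cfg->mutex);
	snapshot = cfg->state;
	pthread_mutex_unlock(&cfg->mutex);
	return snapshot;
}

/* Dispatch on the snapshot: 0 = proceed, 1 = busy (retry), -1 = fail. */
static int queue_decision(struct cfg *cfg)
{
	switch (get_state(cfg)) {
	case STATE_RESET:
		return 1;
	case STATE_FAILTERM:
		return -1;
	default:
		return 0;
	}
}
```

The snapshot can be out of date by the time the switch runs, so this pattern only guarantees that a reader never sees a half-updated value. Paths that must make a decision and change the state atomically keep the mutex held across both, as the host reset handler in the patch does. A mutex may sleep, so the pattern applies only where sleeping is permitted.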

* [PATCH v2 30/30] MAINTAINERS: Add cxlflash driver
  2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
                   ` (28 preceding siblings ...)
  2015-09-16 21:32 ` [PATCH v2 29/30] cxlflash: Fix to avoid state change collision Matthew R. Ochs
@ 2015-09-16 21:33 ` Matthew R. Ochs
  2015-09-23 19:19   ` Brian King
  29 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 21:33 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Add stanza for cxlflash SCSI driver.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 310da42..b0b2c3f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3153,6 +3153,15 @@ F:	Documentation/powerpc/cxl.txt
 F:	Documentation/powerpc/cxl.txt
 F:	Documentation/ABI/testing/sysfs-class-cxl
 
+CXLFLASH (IBM Coherent Accelerator Processor Interface CAPI Flash) SCSI DRIVER
+M:	Manoj N. Kumar <manoj@linux.vnet.ibm.com>
+M:	Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
+L:	linux-scsi@vger.kernel.org
+S:	Supported
+F:	drivers/scsi/cxlflash/
+F:	include/uapi/scsi/cxlflash_ioctls.h
+F:	Documentation/powerpc/cxlflash.txt
+
 STMMAC ETHERNET DRIVER
 M:	Giuseppe Cavallaro <peppe.cavallaro@st.com>
 L:	netdev@vger.kernel.org
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* RE: [PATCH v2 09/30] cxlflash: Fix to stop interrupt processing on remove
  2015-09-16 21:28 ` [PATCH v2 09/30] cxlflash: Fix to stop interrupt processing on remove Matthew R. Ochs
@ 2015-09-17 11:58   ` David Laight
  2015-09-17 16:55     ` Matthew R. Ochs
  0 siblings, 1 reply; 79+ messages in thread
From: David Laight @ 2015-09-17 11:58 UTC (permalink / raw)
  To: 'Matthew R. Ochs',
	linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan
  Cc: Michael Neuling, Manoj N. Kumar, linuxppc-dev

From: Linuxppc-dev Matthew R. Ochs
> Sent: 16 September 2015 22:28
> Interrupt processing can run in parallel to a remove operation. This
> can lead to a condition where the interrupt handler is processing with
> memory that has been freed.
>
> To avoid processing an interrupt while memory may be yanked, check for
> removal while in the interrupt handler. Bail when removal is imminent.

On the face of it this just reduces the size of the window somewhat.

What happens if the interrupt routine reads the flag just before it is set
(so is processing the entry that is being removed) and is then (say)
interrupted by a higher priority interrupt that takes longer to execute than
the remove code?

You've still got an interrupt routine accessing freed memory.

	David

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 09/30] cxlflash: Fix to stop interrupt processing on remove
  2015-09-17 11:58   ` David Laight
@ 2015-09-17 16:55     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-17 16:55 UTC (permalink / raw)
  To: David Laight
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan, Michael Neuling,
	Manoj N. Kumar, linuxppc-dev

> On Sep 17, 2015, at 6:58 AM, David Laight <David.Laight@ACULAB.COM> wrote:
>
> From: Linuxppc-dev Matthew R. Ochs
>> Sent: 16 September 2015 22:28
>> Interrupt processing can run in parallel to a remove operation. This
>> can lead to a condition where the interrupt handler is processing with
>> memory that has been freed.
>>
>> To avoid processing an interrupt while memory may be yanked, check for
>> removal while in the interrupt handler. Bail when removal is imminent.
>
> On the face of it this just reduces the size of the window somewhat.

Agreed.

>
> What happens if the interrupt routine reads the flag just before it is set
> (so is processing the entry that is being removed) and is then (say)
> interrupted by a higher priority interrupt that takes longer to execute than
> the remove code?

Understood. To completely close we'd need to either introduce a lock or a
reciprocal flag/count such that the remove doesn't make forward progress
until after interrupt processing has completed. I can look at introducing such
a mechanism in a later patch to fully remove the exposure.


-matt

^ permalink raw reply	[flat|nested] 79+ messages in thread
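
The "reciprocal flag/count" idea from the reply above can be sketched with C11 atomics. This is one possible shape under assumptions, not a proposal from the thread: struct dev, irq_enter(), irq_exit(), remove_begin() and remove_can_free() are hypothetical names. The handler announces itself before checking the removal flag, and the remove path sets the flag before checking the count, so at least one side always notices the other.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device bookkeeping. */
struct dev {
	atomic_bool removing;	/* set once by the remove path */
	atomic_int in_irq;	/* handlers currently running */
};

/* Handler entry: count ourselves first, then check for removal.
 * Returns false if the handler must bail without touching memory. */
static bool irq_enter(struct dev *d)
{
	atomic_fetch_add(&d->in_irq, 1);
	if (atomic_load(&d->removing)) {
		atomic_fetch_sub(&d->in_irq, 1);
		return false;
	}
	return true;
}

/* Handler exit: drop our count so a waiting remove can proceed. */
static void irq_exit(struct dev *d)
{
	int prev = atomic_fetch_sub(&d->in_irq, 1);

	assert(prev > 0);
	(void)prev;
}

/* Remove path, step 1: stop new handlers from starting work. */
static void remove_begin(struct dev *d)
{
	atomic_store(&d->removing, true);
}

/* Remove path, step 2: memory may be freed only once this is true.
 * A real remove routine would wait here instead of polling once. */
static bool remove_can_free(struct dev *d)
{
	return atomic_load(&d->in_irq) == 0;
}
```

The ordering matters: with the default sequentially consistent atomics, a handler that passed irq_enter() before the flag was set is still visible in in_irq when the remove path checks it. In the kernel, free_irq() and synchronize_irq() already wait for running handlers of an interrupt line, so a driver may not need a hand-rolled counter at all.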

* Re: [PATCH v2 01/30] cxlflash: Fix to avoid invalid port_sel value
  2015-09-16 21:25 ` [PATCH v2 01/30] cxlflash: Fix to avoid invalid port_sel value Matthew R. Ochs
@ 2015-09-18  1:16   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18  1:16 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj Kumar, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 02/30] cxlflash: Replace magic numbers with literals
  2015-09-16 21:26 ` [PATCH v2 02/30] cxlflash: Replace magic numbers with literals Matthew R. Ochs
@ 2015-09-18  1:18   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18  1:18 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj Kumar, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 03/30] cxlflash: Fix read capacity timeout
  2015-09-16 21:26 ` [PATCH v2 03/30] cxlflash: Fix read capacity timeout Matthew R. Ochs
@ 2015-09-18  1:21   ` Brian King
  2015-09-21 11:36   ` Tomas Henzl
  1 sibling, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18  1:21 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj Kumar, Manoj N. Kumar

On 09/16/2015 04:26 PM, Matthew R. Ochs wrote:
> @@ -296,7 +296,7 @@ static int read_cap16(struct scsi_device *sdev, struct llun_info *lli)
>  	int rc = 0;
>  	int result = 0;
>  	int retry_cnt = 0;
> -	u32 tout = (MC_DISCOVERY_TIMEOUT * HZ);
> +	u32 to = (CMD_TIMEOUT * HZ);
> 
>  retry:
>  	cmd_buf = kzalloc(CMD_BUFSIZE, GFP_KERNEL);
> @@ -315,8 +315,7 @@ retry:
>  		retry_cnt ? "re" : "", scsi_cmd[0]);
> 
>  	result = scsi_execute(sdev, scsi_cmd, DMA_FROM_DEVICE, cmd_buf,
> -			      CMD_BUFSIZE, sense_buf, tout, CMD_RETRIES,
> -			      0, NULL);
> +			      CMD_BUFSIZE, sense_buf, to, CMD_RETRIES, 0, NULL);
> 
>  	if (driver_byte(result) == DRIVER_SENSE) {
>  		result &= ~(0xFF<<24); /* DRIVER_SENSE is not an error */
> @@ -1376,8 +1375,8 @@ out_attach:
>  	attach->block_size = gli->blk_len;
>  	attach->mmio_size = sizeof(afu->afu_map->hosts[0].harea);
>  	attach->last_lba = gli->max_lba;
> -	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT) /
> -		gli->blk_len;
> +	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT);
> +	attach->max_xfer /= gli->blk_len;

This change and the one above are not really part of the patch. Not a big deal, but
in future it would be good to either call out that there are a couple of unrelated
formatting changes, or keep them out and put them in a separate cleanup patch.

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal
  2015-09-16 21:27 ` [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal Matthew R. Ochs
@ 2015-09-18  1:26   ` Brian King
  2015-09-18 23:18     ` Matthew R. Ochs
  2015-09-21 12:11   ` Tomas Henzl
  1 sibling, 1 reply; 79+ messages in thread
From: Brian King @ 2015-09-18  1:26 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 09/16/2015 04:27 PM, Matthew R. Ochs wrote:
> When a LUN is removed, the sdev that is associated with the LUN
> remains intact until its reference count drops to 0. In order
> to prevent an sdev from being removed while a context is still
> associated with it, obtain an additional reference per-context
> for each LUN attached to the context.
> 
> This resolves a potential Oops in the release handler when
> dealing with a LUN that has already been removed.
> 
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> Suggested-by: Brian King <brking@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/superpipe.c | 36 ++++++++++++++++++++++++------------
>  1 file changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
> index fa513ba..1fa4af6 100644
> --- a/drivers/scsi/cxlflash/superpipe.c
> +++ b/drivers/scsi/cxlflash/superpipe.c
> @@ -880,6 +880,9 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev,
>  			sys_close(lfd);
>  	}
> 
> +	/* Release the sdev reference that bound this LUN to the context */
> +	scsi_device_put(sdev);
> +
>  out:
>  	if (put_ctx)
>  		put_context(ctxi);
> @@ -1287,11 +1290,18 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  			}
>  	}
> 
> +	rc = scsi_device_get(sdev);
> +	if (unlikely(rc)) {
> +		dev_err(dev, "%s: Unable to get sdev reference!\n", __func__);
> +		goto out;
> +	}
> +
>  	lun_access = kzalloc(sizeof(*lun_access), GFP_KERNEL);
>  	if (unlikely(!lun_access)) {
>  		dev_err(dev, "%s: Unable to allocate lun_access!\n", __func__);
> +		scsi_device_put(sdev);

Looks like you've got a double scsi_device_put in this path, since there is another put
in the err0 path.

>  		rc = -ENOMEM;
> -		goto out;
> +		goto err0;
>  	}
> 
>  	lun_access->lli = lli;
> @@ -1311,21 +1321,21 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  		dev_err(dev, "%s: Could not initialize context %p\n",
>  			__func__, ctx);
>  		rc = -ENODEV;
> -		goto err0;
> +		goto err1;
>  	}
> 
>  	ctxid = cxl_process_element(ctx);
>  	if (unlikely((ctxid > MAX_CONTEXT) || (ctxid < 0))) {
>  		dev_err(dev, "%s: ctxid (%d) invalid!\n", __func__, ctxid);
>  		rc = -EPERM;
> -		goto err1;
> +		goto err2;
>  	}
> 
>  	file = cxl_get_fd(ctx, &cfg->cxl_fops, &fd);
>  	if (unlikely(fd < 0)) {
>  		rc = -ENODEV;
>  		dev_err(dev, "%s: Could not get file descriptor\n", __func__);
> -		goto err1;
> +		goto err2;
>  	}
> 
>  	/* Translate read/write O_* flags from fcntl.h to AFU permission bits */
> @@ -1335,7 +1345,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  	if (unlikely(!ctxi)) {
>  		dev_err(dev, "%s: Failed to create context! (%d)\n",
>  			__func__, ctxid);
> -		goto err2;
> +		goto err3;
>  	}
> 
>  	work = &ctxi->work;
> @@ -1346,13 +1356,13 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  	if (unlikely(rc)) {
>  		dev_dbg(dev, "%s: Could not start context rc=%d\n",
>  			__func__, rc);
> -		goto err3;
> +		goto err4;
>  	}
> 
>  	rc = afu_attach(cfg, ctxi);
>  	if (unlikely(rc)) {
>  		dev_err(dev, "%s: Could not attach AFU rc %d\n", __func__, rc);
> -		goto err4;
> +		goto err5;
>  	}
> 
>  	/*
> @@ -1388,13 +1398,13 @@ out:
>  		__func__, ctxid, fd, attach->block_size, rc, attach->last_lba);
>  	return rc;
> 
> -err4:
> +err5:
>  	cxl_stop_context(ctx);
> -err3:
> +err4:
>  	put_context(ctxi);
>  	destroy_context(cfg, ctxi);
>  	ctxi = NULL;
> -err2:
> +err3:
>  	/*
>  	 * Here, we're overriding the fops with a dummy all-NULL fops because
>  	 * fput() calls the release fop, which will cause us to mistakenly
> @@ -1406,10 +1416,12 @@ err2:
>  	fput(file);
>  	put_unused_fd(fd);
>  	fd = -1;
> -err1:
> +err2:
>  	cxl_release_context(ctx);
> -err0:
> +err1:
>  	kfree(lun_access);
> +err0:
> +	scsi_device_put(sdev);
>  	goto out;
>  }
> 


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 05/30] cxlflash: Fix data corruption when vLUN used over multiple cards
  2015-09-16 21:27 ` [PATCH v2 05/30] cxlflash: Fix data corruption when vLUN used over multiple cards Matthew R. Ochs
@ 2015-09-18  1:28   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18  1:28 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 06/30] cxlflash: Fix to avoid sizeof(bool)
  2015-09-16 21:27 ` [PATCH v2 06/30] cxlflash: Fix to avoid sizeof(bool) Matthew R. Ochs
@ 2015-09-18  1:29   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18  1:29 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 07/30] cxlflash: Fix context encode mask width
  2015-09-16 21:27 ` [PATCH v2 07/30] cxlflash: Fix context encode mask width Matthew R. Ochs
@ 2015-09-18  1:29   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18  1:29 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH
  2015-09-16 21:27 ` [PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH Matthew R. Ochs
@ 2015-09-18 13:37   ` Brian King
  2015-09-18 23:54     ` Matthew R. Ochs
  0 siblings, 1 reply; 79+ messages in thread
From: Brian King @ 2015-09-18 13:37 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 09/16/2015 04:27 PM, Matthew R. Ochs wrote:
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -2311,6 +2311,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
>  	cfg->lr_port = -1;
>  	mutex_init(&cfg->ctx_tbl_list_mutex);
>  	mutex_init(&cfg->ctx_recovery_mutex);
> +	init_rwsem(&cfg->ioctl_rwsem);
>  	INIT_LIST_HEAD(&cfg->ctx_err_recovery);
>  	INIT_LIST_HEAD(&cfg->lluns);
> 
> @@ -2365,6 +2366,19 @@ out_remove:
>  }
> 
>  /**
> + * drain_ioctls() - wait until all currently executing ioctls have completed
> + * @cfg:	Internal structure associated with the host.
> + *
> + * Obtain write access to read/write semaphore that wraps ioctl
> + * handling to 'drain' ioctls currently executing.
> + */
> +static void drain_ioctls(struct cxlflash_cfg *cfg)
> +{
> +	down_write(&cfg->ioctl_rwsem);
> +	up_write(&cfg->ioctl_rwsem);
> +}
> +
> +/**
>   * cxlflash_pci_error_detected() - called when a PCI error is detected
>   * @pdev:	PCI device struct.
>   * @state:	PCI channel state.
> @@ -2383,16 +2397,14 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
>  	switch (state) {
>  	case pci_channel_io_frozen:
>  		cfg->state = STATE_LIMBO;
> -
> -		/* Turn off legacy I/O */
>  		scsi_block_requests(cfg->host);
> +		drain_ioctls(cfg);

So, what kicks any outstanding ioctls back? Let's assume you are in the middle of disk_attach
and you've sent the READ_CAP16 to the device. It appears as if what would happen here is we'd
sit here in cxlflash_pci_error_detected. Eventually, the READ_CAP16 would timeout. This would
wake the SCSI error handler, and end up calling your eh_device_reset handler, which would see that
we are in STATE_LIMBO, where it would then do a wait_event, waiting for us to get out of STATE_LIMBO,
and we would end up in a deadlock.

Rather than implementing a rw semaphore, would it be better to simply make the ioctls check the
state we are in and either wait to get out of EEH state or fail themselves?

>  		rc = cxlflash_mark_contexts_error(cfg);
>  		if (unlikely(rc))
>  			dev_err(dev, "%s: Failed to mark user contexts!(%d)\n",
>  				__func__, rc);
>  		term_mc(cfg, UNDO_START);
>  		stop_afu(cfg);
> -
>  		return PCI_ERS_RESULT_NEED_RESET;
>  	case pci_channel_io_perm_failure:
>  		cfg->state = STATE_FAILTERM;
> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
> index cf2a85d..8a18230 100644
> --- a/drivers/scsi/cxlflash/superpipe.c
> +++ b/drivers/scsi/cxlflash/superpipe.c
> @@ -1214,6 +1214,48 @@ static const struct file_operations null_fops = {
>  };
> 
>  /**
> + * check_state() - checks and responds to the current adapter state
> + * @cfg:	Internal structure associated with the host.
> + * @ioctl:	Indicates if on an ioctl thread.
> + *
> + * This routine can block and should only be used on process context.
> + * When blocking on an ioctl thread, the ioctl read semaphore should be
> + * let up to allow for draining actively running ioctls. Also note that
> + * when waking up from waiting in reset, the state is unknown and must
> + * be checked again before proceeding.
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +static int check_state(struct cxlflash_cfg *cfg, bool ioctl)

All your callers appear to set the second parameter to true, so why bother having it?

> +{
> +	struct device *dev = &cfg->dev->dev;
> +	int rc = 0;
> +
> +retry:
> +	switch (cfg->state) {
> +	case STATE_LIMBO:
> +		dev_dbg(dev, "%s: Limbo state, going to wait...\n", __func__);
> +		if (ioctl)
> +			up_read(&cfg->ioctl_rwsem);
> +		rc = wait_event_interruptible(cfg->limbo_waitq,
> +					      cfg->state != STATE_LIMBO);
> +		if (ioctl)
> +			down_read(&cfg->ioctl_rwsem);
> +		if (unlikely(rc))
> +			break;
> +		goto retry;
> +	case STATE_FAILTERM:
> +		dev_dbg(dev, "%s: Failed/Terminating!\n", __func__);
> +		rc = -ENODEV;
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	return rc;
> +}
> +
> +/**
>   * cxlflash_disk_attach() - attach a LUN to a context
>   * @sdev:	SCSI device associated with LUN.
>   * @attach:	Attach ioctl data structure.

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 10/30] cxlflash: Correct naming of limbo state and waitq
  2015-09-16 21:28 ` [PATCH v2 10/30] cxlflash: Correct naming of limbo state and waitq Matthew R. Ochs
@ 2015-09-18 15:28   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18 15:28 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 11/30] cxlflash: Make functions static
  2015-09-16 21:28 ` [PATCH v2 11/30] cxlflash: Make functions static Matthew R. Ochs
@ 2015-09-18 15:34   ` Brian King
  2015-09-21 12:18   ` Tomas Henzl
  1 sibling, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18 15:34 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 12/30] cxlflash: Refine host/device attributes
  2015-09-16 21:29 ` [PATCH v2 12/30] cxlflash: Refine host/device attributes Matthew R. Ochs
@ 2015-09-18 21:34   ` Brian King
  2015-09-18 23:56     ` Matthew R. Ochs
  2015-09-21  9:55     ` David Laight
  0 siblings, 2 replies; 79+ messages in thread
From: Brian King @ 2015-09-18 21:34 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan, Shane Seymour
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 09/16/2015 04:29 PM, Matthew R. Ochs wrote:
> Implement the following suggestions and add two new attributes
> to allow for debugging the port LUN table.
> 
>  - use scnprintf() instead of snprintf()
>  - use DEVICE_ATTR_RO and DEVICE_ATTR_RW
> 
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> Suggested-by: Shane Seymour <shane.seymour@hp.com>
> ---
>  drivers/scsi/cxlflash/main.c | 180 +++++++++++++++++++++++++++++++++----------
>  1 file changed, 138 insertions(+), 42 deletions(-)
> 

>  /**
> - * cxlflash_show_dev_mode() - presents the current mode of the device
> + * cxlflash_show_port_lun_table() - queries and presents the port LUN table
> + * @port:	Desired port for status reporting.
> + * @afu:	AFU owning the specified port.
> + * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
> + *
> + * Return: The size of the ASCII string returned in @buf.
> + */
> +static ssize_t cxlflash_show_port_lun_table(u32 port,
> +					    struct afu *afu,
> +					    char *buf)
> +{
> +	int i;
> +	ssize_t bytes = 0;
> +	__be64 __iomem *fc_port;
> +
> +	if (port >= NUM_FC_PORTS)
> +		return 0;
> +
> +	fc_port = &afu->afu_map->global.fc_port[port][0];
> +
> +	for (i = 0; i < CXLFLASH_NUM_VLUNS; i++, buf += 22)

Rather than this bug prone hard coded 22, how about never incrementing buf and do something
similar to this:

> +		bytes += scnprintf(buf, PAGE_SIZE, "%03d: %016llX\n",
> +				   i, readq_be(&fc_port[i]));

		bytes += scnprintf(&buf[bytes], PAGE_SIZE, "%03d: %016llX\n",
				   i, readq_be(&fc_port[i]));

> +	return bytes;
> +}
> +



-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 13/30] cxlflash: Fix to avoid spamming the kernel log
  2015-09-16 21:30 ` [PATCH v2 13/30] cxlflash: Fix to avoid spamming the kernel log Matthew R. Ochs
@ 2015-09-18 21:39   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-18 21:39 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal
  2015-09-18  1:26   ` Brian King
@ 2015-09-18 23:18     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-18 23:18 UTC (permalink / raw)
  To: Brian King
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Ian Munsie,
	Daniel Axtens, Andrew Donnellan, Michael Neuling, linuxppc-dev,
	Manoj N. Kumar

> On Sep 17, 2015, at 8:26 PM, Brian King <brking@linux.vnet.ibm.com> wrote:
> 
> On 09/16/2015 04:27 PM, Matthew R. Ochs wrote:
>> 
>> 	lun_access = kzalloc(sizeof(*lun_access), GFP_KERNEL);
>> 	if (unlikely(!lun_access)) {
>> 		dev_err(dev, "%s: Unable to allocate lun_access!\n", __func__);
>> +		scsi_device_put(sdev);
> 
> Looks like you've got a double scsi_device_put in this path, since there is another put
> in the err0 path.

Good catch! I'll fix in v3.
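
For illustration only, here is a minimal userspace sketch of the unwind
shape that avoids the double put: the failing branch sets rc and jumps, and
only the label releases the reference. struct ref_sketch, ref_get(),
ref_put(), and attach_sketch() are hypothetical stand-ins, not the SCSI
midlayer API and not the v3 patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted device. */
struct ref_sketch {
	int count;
};

static int ref_get(struct ref_sketch *r)
{
	r->count++;
	return 0;
}

static void ref_put(struct ref_sketch *r)
{
	assert(r->count > 0);	/* a double put would trip this */
	r->count--;
}

/*
 * Attach-like routine. On failure each resource is released exactly once,
 * and only by the unwind labels. fail_alloc forces the allocation failure
 * path so it can be exercised.
 */
static int attach_sketch(struct ref_sketch *r, int fail_alloc, void **out)
{
	void *access;
	int rc;

	rc = ref_get(r);
	if (rc)
		goto out;

	access = fail_alloc ? NULL : malloc(16);
	if (!access) {
		rc = -ENOMEM;	/* no ref_put() here: err0 owns the release */
		goto err0;
	}

	*out = access;
	return 0;

err0:
	ref_put(r);
out:
	return rc;
}
```

On the success path the reference stays held, mirroring the patch's intent
that the matching put happens at detach time.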

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH
  2015-09-18 13:37   ` Brian King
@ 2015-09-18 23:54     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-18 23:54 UTC (permalink / raw)
  To: Brian King
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Ian Munsie,
	Daniel Axtens, Andrew Donnellan, Michael Neuling, linuxppc-dev,
	Manoj N. Kumar

> On Sep 18, 2015, at 8:37 AM, Brian King <brking@linux.vnet.ibm.com> wrote:
> On 09/16/2015 04:27 PM, Matthew R. Ochs wrote:
>> 
>> /**
>> + * drain_ioctls() - wait until all currently executing ioctls have completed
>> + * @cfg:	Internal structure associated with the host.
>> + *
>> + * Obtain write access to read/write semaphore that wraps ioctl
>> + * handling to 'drain' ioctls currently executing.
>> + */
>> +static void drain_ioctls(struct cxlflash_cfg *cfg)
>> +{
>> +	down_write(&cfg->ioctl_rwsem);
>> +	up_write(&cfg->ioctl_rwsem);
>> +}
>> +
>> +/**
>>  * cxlflash_pci_error_detected() - called when a PCI error is detected
>>  * @pdev:	PCI device struct.
>>  * @state:	PCI channel state.
>> @@ -2383,16 +2397,14 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
>> 	switch (state) {
>> 	case pci_channel_io_frozen:
>> 		cfg->state = STATE_LIMBO;
>> -
>> -		/* Turn off legacy I/O */
>> 		scsi_block_requests(cfg->host);
>> +		drain_ioctls(cfg);
> 
> So, what kicks any outstanding ioctls back? Let's assume you are in the middle of disk_attach
> and you've sent the READ_CAP16 to the device. It appears as if what would happen here is we'd
> sit here in cxlflash_pci_error_detected. Eventually, the READ_CAP16 would timeout. This would
> wake the SCSI error handler, and end up calling your eh_device_reset handler, which would see that
> we are in STATE_LIMBO, where it would then do a wait_event, waiting for us to get out of STATE_LIMBO,
> and we would end up in a deadlock.
> 
> Rather than implementing a rw semaphore, would it be better to simply make the ioctls check the
> state we are in and either wait to get out of EEH state or fail themselves?

We do have the ioctls check the state and wait for EEH to complete
or fail completely in the event that the device is terminating (see
ioctl_common()). The drain exists to create a wait point for ioctls that
have already passed the state check and are active. The CXL services
cannot be called during the recovery window (maybe this requirement
will go away in a future release?), thus the reason for this 'drain'.

To handle it I considered 3 options:

 - add state check wraps to all CXL service calls
 - create a "running ioctls" count that could be evaluated
 - wrap the ioctl in read semaphore and obtain write access to 'wait'

I started with the first option and it quickly made the code very nasty. I then
began implementing the second option and as I was writing the code to wrap
the ioctl with increment/decrement statements, the third option entered my
mind and seemed like a much cleaner solution. Therefore I went with that
approach and did not look back.

With regard to your example, you bring up a good point and we'll need to
do something about that. One thought that comes to mind would be for us
to drop the semaphore before making this type of call (I believe there are
only 2 places like this), reacquiring it when we return, and then checking
the state to make sure we're not in a reset situation.

> 
>> 		rc = cxlflash_mark_contexts_error(cfg);
>> 		if (unlikely(rc))
>> 			dev_err(dev, "%s: Failed to mark user contexts!(%d)\n",
>> 				__func__, rc);
>> 		term_mc(cfg, UNDO_START);
>> 		stop_afu(cfg);
>> -
>> 		return PCI_ERS_RESULT_NEED_RESET;
>> 	case pci_channel_io_perm_failure:
>> 		cfg->state = STATE_FAILTERM;
>> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
>> index cf2a85d..8a18230 100644
>> --- a/drivers/scsi/cxlflash/superpipe.c
>> +++ b/drivers/scsi/cxlflash/superpipe.c
>> @@ -1214,6 +1214,48 @@ static const struct file_operations null_fops = {
>> };
>> 
>> /**
>> + * check_state() - checks and responds to the current adapter state
>> + * @cfg:	Internal structure associated with the host.
>> + * @ioctl:	Indicates if on an ioctl thread.
>> + *
>> + * This routine can block and should only be used on process context.
>> + * When blocking on an ioctl thread, the ioctl read semaphore should be
>> + * let up to allow for draining actively running ioctls. Also note that
>> + * when waking up from waiting in reset, the state is unknown and must
>> + * be checked again before proceeding.
>> + *
>> + * Return: 0 on success, -errno on failure
>> + */
>> +static int check_state(struct cxlflash_cfg *cfg, bool ioctl)
> 
> All your callers appear to set the second parameter to true, so why bother having it?

That's a good point. I originally had a case where there was a need for this
but have since removed it. I can fix that in v3.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 12/30] cxlflash: Refine host/device attributes
  2015-09-18 21:34   ` Brian King
@ 2015-09-18 23:56     ` Matthew R. Ochs
  2015-09-21  9:55     ` David Laight
  1 sibling, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-18 23:56 UTC (permalink / raw)
  To: Brian King
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Ian Munsie,
	Daniel Axtens, Andrew Donnellan, Shane Seymour, Michael Neuling,
	linuxppc-dev, Manoj N. Kumar

> On Sep 18, 2015, at 4:34 PM, Brian King <brking@linux.vnet.ibm.com> wrote:
> On 09/16/2015 04:29 PM, Matthew R. Ochs wrote:
>> 
>> +	ssize_t bytes = 0;
>> +	__be64 __iomem *fc_port;
>> +
>> +	if (port >= NUM_FC_PORTS)
>> +		return 0;
>> +
>> +	fc_port = &afu->afu_map->global.fc_port[port][0];
>> +
>> +	for (i = 0; i < CXLFLASH_NUM_VLUNS; i++, buf += 22)
> 
> Rather than this bug prone hard coded 22, how about never incrementing buf and do something
> similar to this:
> 
>> +		bytes += scnprintf(buf, PAGE_SIZE, "%03d: %016llX\n",
>> +				   i, readq_be(&fc_port[i]));
> 
> 		bytes += scnprintf(&buf[bytes], PAGE_SIZE, "%03d: %016llX\n",
> 				   i, readq_be(&fc_port[i]));
> 
>> +	return bytes;
>> +}
>> +

Great suggestion! Will fix in v3.
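
For reference, a userspace sketch of the accumulate-into-buf pattern.
Two things here are assumptions rather than anything from the patch:
scnprintf_sketch() only approximates the kernel's scnprintf() (it returns
the number of characters actually stored, never more than size - 1), and
the size argument shrinks by bytes on each pass so the write can never run
past the end of the buffer. format_table_sketch() and its parameters are
hypothetical; the format string is the one from the patch, where a full
row is 22 characters.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Approximation of scnprintf(): returns characters stored, excluding NUL. */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	/* vsnprintf() reports the untruncated length; clamp it. */
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Format count rows into buf without a hard-coded per-row width. */
static size_t format_table_sketch(char *buf, size_t size,
				  const unsigned long long *vals, int count)
{
	size_t bytes = 0;
	int i;

	for (i = 0; i < count; i++)
		bytes += scnprintf_sketch(buf + bytes, size - bytes,
					  "%03d: %016llX\n", i, vals[i]);
	return bytes;
}
```

Because the helper never reports more than it stored, bytes stays below
size and the running offset remains valid even when a row is truncated.
The same loop built on libc's snprintf() would not have that property,
since snprintf() returns the length the output would have had.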


-matt

^ permalink raw reply	[flat|nested] 79+ messages in thread

* RE: [PATCH v2 12/30] cxlflash: Refine host/device attributes
  2015-09-18 21:34   ` Brian King
  2015-09-18 23:56     ` Matthew R. Ochs
@ 2015-09-21  9:55     ` David Laight
  1 sibling, 0 replies; 79+ messages in thread
From: David Laight @ 2015-09-21  9:55 UTC (permalink / raw)
  To: 'Brian King',
	Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan, Shane Seymour
  Cc: Michael Neuling, Manoj N. Kumar, linuxppc-dev

From: Brian King
> Sent: 18 September 2015 22:35
...
> > +	for (i = 0; i < CXLFLASH_NUM_VLUNS; i++, buf += 22)
> 
> Rather than this bug prone hard coded 22, how about never incrementing buf and do something
> similar to this:
> 
> > +		bytes += scnprintf(buf, PAGE_SIZE, "%03d: %016llX\n",
> > +				   i, readq_be(&fc_port[i]));
> 
> 		bytes += scnprintf(&buf[bytes], PAGE_SIZE, "%03d: %016llX\n",
> 				   i, readq_be(&fc_port[i]));
...

	bytes += scnprintf(buf + bytes, PAGE_SIZE - bytes, ....

You need to check scnprintf()'s return value though.
The above is wrong for libc's snprintf().

	David

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 03/30] cxlflash: Fix read capacity timeout
  2015-09-16 21:26 ` [PATCH v2 03/30] cxlflash: Fix read capacity timeout Matthew R. Ochs
  2015-09-18  1:21   ` Brian King
@ 2015-09-21 11:36   ` Tomas Henzl
  2015-09-21 22:11     ` Matthew R. Ochs
  1 sibling, 1 reply; 79+ messages in thread
From: Tomas Henzl @ 2015-09-21 11:36 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Brian King, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj Kumar, Manoj N. Kumar

On 16.9.2015 23:26, Matthew R. Ochs wrote:
> From: Manoj Kumar <kumarmn@us.ibm.com>
>
> The timeout value for read capacity is too small. Certain devices
> may take longer to respond and thus the command may prematurely
> timeout. Additionally the literal used for the timeout is stale.
>
> Update the timeout to 30 seconds (matches the value used in sd.c)
> and rework the timeout literal to a more appropriate description.
>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> Suggested-by: Brian King <brking@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/superpipe.c | 9 ++++-----
>  drivers/scsi/cxlflash/superpipe.h | 2 +-
>  drivers/scsi/cxlflash/vlun.c      | 4 ++--
>  3 files changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
> index 7df985d..fa513ba 100644
> --- a/drivers/scsi/cxlflash/superpipe.c
> +++ b/drivers/scsi/cxlflash/superpipe.c
> @@ -296,7 +296,7 @@ static int read_cap16(struct scsi_device *sdev, struct llun_info *lli)
>  	int rc = 0;
>  	int result = 0;
>  	int retry_cnt = 0;
> -	u32 tout = (MC_DISCOVERY_TIMEOUT * HZ);
> +	u32 to = (CMD_TIMEOUT * HZ);

In v3, please remove the parentheses here^

>  
>  retry:
>  	cmd_buf = kzalloc(CMD_BUFSIZE, GFP_KERNEL);
...

> @@ -1376,8 +1375,8 @@ out_attach:
>  	attach->block_size = gli->blk_len;
>  	attach->mmio_size = sizeof(afu->afu_map->hosts[0].harea);
>  	attach->last_lba = gli->max_lba;
> -	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT) /
> -		gli->blk_len;
> +	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT);

and here^ too.

Thanks,
Tomas

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal
  2015-09-16 21:27 ` [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal Matthew R. Ochs
  2015-09-18  1:26   ` Brian King
@ 2015-09-21 12:11   ` Tomas Henzl
  2015-09-21 22:32     ` Matthew R. Ochs
  1 sibling, 1 reply; 79+ messages in thread
From: Tomas Henzl @ 2015-09-21 12:11 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Brian King, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 16.9.2015 23:27, Matthew R. Ochs wrote:
> When a LUN is removed, the sdev that is associated with the LUN
> remains intact until its reference count drops to 0. In order
> to prevent an sdev from being removed while a context is still
> associated with it, obtain an additional reference per-context
> for each LUN attached to the context.
>
> This resolves a potential Oops in the release handler when
> dealing with a LUN that has already been removed.
>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> Suggested-by: Brian King <brking@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/superpipe.c | 36 ++++++++++++++++++++++++------------
>  1 file changed, 24 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
> index fa513ba..1fa4af6 100644
> --- a/drivers/scsi/cxlflash/superpipe.c
> +++ b/drivers/scsi/cxlflash/superpipe.c
> @@ -880,6 +880,9 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev,
>  			sys_close(lfd);
>  	}
>  
> +	/* Release the sdev reference that bound this LUN to the context */
> +	scsi_device_put(sdev);
> +

I'm not sure about the use of scsi_device_get+put here, and I don't quite
understand what you are trying to fix or how it can happen.
scsi_device_get takes an additional module reference, so if used from
a module it shouldn't be held for a long time.
Is it possible for a user to rmmod the cxlflash module
after the disk attach function is called?

Cheers,
--tm

>  out:
>  	if (put_ctx)
>  		put_context(ctxi);
> @@ -1287,11 +1290,18 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  			}
>  	}
>  
> +	rc = scsi_device_get(sdev);
> +	if (unlikely(rc)) {
> +		dev_err(dev, "%s: Unable to get sdev reference!\n", __func__);
> +		goto out;
> +	}
> +
>  	lun_access = kzalloc(sizeof(*lun_access), GFP_KERNEL);
>  	if (unlikely(!lun_access)) {
>  		dev_err(dev, "%s: Unable to allocate lun_access!\n", __func__);
> +		scsi_device_put(sdev);
>  		rc = -ENOMEM;
> -		goto out;
> +		goto err0;
>  	}
>  
>  	lun_access->lli = lli;
> @@ -1311,21 +1321,21 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  		dev_err(dev, "%s: Could not initialize context %p\n",
>  			__func__, ctx);
>  		rc = -ENODEV;
> -		goto err0;
> +		goto err1;
>  	}
>  
>  	ctxid = cxl_process_element(ctx);
>  	if (unlikely((ctxid > MAX_CONTEXT) || (ctxid < 0))) {
>  		dev_err(dev, "%s: ctxid (%d) invalid!\n", __func__, ctxid);
>  		rc = -EPERM;
> -		goto err1;
> +		goto err2;
>  	}
>  
>  	file = cxl_get_fd(ctx, &cfg->cxl_fops, &fd);
>  	if (unlikely(fd < 0)) {
>  		rc = -ENODEV;
>  		dev_err(dev, "%s: Could not get file descriptor\n", __func__);
> -		goto err1;
> +		goto err2;
>  	}
>  
>  	/* Translate read/write O_* flags from fcntl.h to AFU permission bits */
> @@ -1335,7 +1345,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  	if (unlikely(!ctxi)) {
>  		dev_err(dev, "%s: Failed to create context! (%d)\n",
>  			__func__, ctxid);
> -		goto err2;
> +		goto err3;
>  	}
>  
>  	work = &ctxi->work;
> @@ -1346,13 +1356,13 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
>  	if (unlikely(rc)) {
>  		dev_dbg(dev, "%s: Could not start context rc=%d\n",
>  			__func__, rc);
> -		goto err3;
> +		goto err4;
>  	}
>  
>  	rc = afu_attach(cfg, ctxi);
>  	if (unlikely(rc)) {
>  		dev_err(dev, "%s: Could not attach AFU rc %d\n", __func__, rc);
> -		goto err4;
> +		goto err5;
>  	}
>  
>  	/*
> @@ -1388,13 +1398,13 @@ out:
>  		__func__, ctxid, fd, attach->block_size, rc, attach->last_lba);
>  	return rc;
>  
> -err4:
> +err5:
>  	cxl_stop_context(ctx);
> -err3:
> +err4:
>  	put_context(ctxi);
>  	destroy_context(cfg, ctxi);
>  	ctxi = NULL;
> -err2:
> +err3:
>  	/*
>  	 * Here, we're overriding the fops with a dummy all-NULL fops because
>  	 * fput() calls the release fop, which will cause us to mistakenly
> @@ -1406,10 +1416,12 @@ err2:
>  	fput(file);
>  	put_unused_fd(fd);
>  	fd = -1;
> -err1:
> +err2:
>  	cxl_release_context(ctx);
> -err0:
> +err1:
>  	kfree(lun_access);
> +err0:
> +	scsi_device_put(sdev);
>  	goto out;
>  }
>  


* Re: [PATCH v2 11/30] cxlflash: Make functions static
  2015-09-16 21:28 ` [PATCH v2 11/30] cxlflash: Make functions static Matthew R. Ochs
  2015-09-18 15:34   ` Brian King
@ 2015-09-21 12:18   ` Tomas Henzl
  2015-09-21 22:36     ` Matthew R. Ochs
  1 sibling, 1 reply; 79+ messages in thread
From: Tomas Henzl @ 2015-09-21 12:18 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Brian King, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 16.9.2015 23:28, Matthew R. Ochs wrote:
> Found during code inspection, that the following functions are not
> being used outside of the file where they are defined. Make them static.
>
> int cxlflash_send_cmd(struct afu *, struct afu_cmd *);
> void cxlflash_wait_resp(struct afu *, struct afu_cmd *);
> int cxlflash_afu_reset(struct cxlflash_cfg *);
> struct afu_cmd *cxlflash_cmd_checkout(struct afu *);
> void cxlflash_cmd_checkin(struct afu_cmd *);
> void init_pcr(struct cxlflash_cfg *);
> int init_global(struct cxlflash_cfg *);
>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/common.h |    5 -
>  drivers/scsi/cxlflash/main.c   | 1018 ++++++++++++++++++++--------------------
>  2 files changed, 509 insertions(+), 514 deletions(-)
>
> diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
> index 6e0be53..2855b09 100644
> --- a/drivers/scsi/cxlflash/common.h
> +++ b/drivers/scsi/cxlflash/common.h
> @@ -194,11 +194,6 @@ static inline u64 lun_to_lunid(u64 lun)
>  	return swab64(lun_id);
>  }
>  
> -int cxlflash_send_cmd(struct afu *, struct afu_cmd *);
> -void cxlflash_wait_resp(struct afu *, struct afu_cmd *);
> -int cxlflash_afu_reset(struct cxlflash_cfg *);
> -struct afu_cmd *cxlflash_cmd_checkout(struct afu *);
> -void cxlflash_cmd_checkin(struct afu_cmd *);
>  int cxlflash_afu_sync(struct afu *, ctx_hndl_t, res_hndl_t, u8);
>  void cxlflash_list_init(void);
>  void cxlflash_term_global_luns(void);
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 01b7f3e..f2f41a7 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -36,7 +36,7 @@ MODULE_LICENSE("GPL");
>  
>  
>  /**
> - * cxlflash_cmd_checkout() - checks out an AFU command
> + * cmd_checkout() - checks out an AFU command
>   * @afu:	AFU to checkout from.
>   *
>   * Commands are checked out in a round-robin fashion. Note that since
> @@ -47,7 +47,7 @@ MODULE_LICENSE("GPL");
>   *
>   * Return: The checked out command or NULL when command pool is empty.
>   */
> -struct afu_cmd *cxlflash_cmd_checkout(struct afu *afu)
> +static struct afu_cmd *cmd_checkout(struct afu *afu)
>  {
>  	int k, dec = CXLFLASH_NUM_CMDS;
>  	struct afu_cmd *cmd;
> @@ -70,7 +70,7 @@ struct afu_cmd *cxlflash_cmd_checkout(struct afu *afu)
>  }
>  
>  /**
> - * cxlflash_cmd_checkin() - checks in an AFU command
> + * cmd_checkin() - checks in an AFU command
>   * @cmd:	AFU command to checkin.
>   *
>   * Safe to pass commands that have already been checked in. Several
> @@ -79,7 +79,7 @@ struct afu_cmd *cxlflash_cmd_checkout(struct afu *afu)
>   * to avoid clobbering values in the event that the command is checked
>   * out right away.
>   */
> -void cxlflash_cmd_checkin(struct afu_cmd *cmd)
> +static void cmd_checkin(struct afu_cmd *cmd)
>  {
>  	cmd->rcb.scp = NULL;
>  	cmd->rcb.timeout = 0;
> @@ -238,7 +238,7 @@ static void cmd_complete(struct afu_cmd *cmd)
>  
>  		resid = cmd->sa.resid;
>  		cmd_is_tmf = cmd->cmd_tmf;
> -		cxlflash_cmd_checkin(cmd); /* Don't use cmd after here */
> +		cmd_checkin(cmd); /* Don't use cmd after here */
>  
>  		pr_debug("%s: calling scsi_set_resid, scp=%p "
>  			 "result=%X resid=%d\n", __func__,
> @@ -260,6 +260,146 @@ static void cmd_complete(struct afu_cmd *cmd)
>  }
>  
>  /**
> + * context_reset() - timeout handler for AFU commands
> + * @cmd:	AFU command that timed out.
> + *
> + * Sends a reset to the AFU.
> + */
> +static void context_reset(struct afu_cmd *cmd)
> +{
> +	int nretry = 0;
> +	u64 rrin = 0x1;
> +	u64 room = 0;
> +	struct afu *afu = cmd->parent;
> +	ulong lock_flags;
> +
> +	pr_debug("%s: cmd=%p\n", __func__, cmd);
> +
> +	spin_lock_irqsave(&cmd->slock, lock_flags);
> +
> +	/* Already completed? */
> +	if (cmd->sa.host_use_b[0] & B_DONE) {
> +		spin_unlock_irqrestore(&cmd->slock, lock_flags);
> +		return;
> +	}
> +
> +	cmd->sa.host_use_b[0] |= (B_DONE | B_ERROR | B_TIMEOUT);
> +	spin_unlock_irqrestore(&cmd->slock, lock_flags);
> +
> +	/*
> +	 * We really want to send this reset at all costs, so spread
> +	 * out wait time on successive retries for available room.
> +	 */
> +	do {
> +		room = readq_be(&afu->host_map->cmd_room);
> +		atomic64_set(&afu->room, room);
> +		if (room)
> +			goto write_rrin;
> +		udelay(nretry);
> +	} while (nretry++ < MC_ROOM_RETRY_CNT);
> +
> +	pr_err("%s: no cmd_room to send reset\n", __func__);
> +	return;
> +
> +write_rrin:
> +	nretry = 0;
> +	writeq_be(rrin, &afu->host_map->ioarrin);
> +	do {
> +		rrin = readq_be(&afu->host_map->ioarrin);
> +		if (rrin != 0x1)
> +			break;
> +		/* Double delay each time */
> +		udelay(2 ^ nretry);

Double delay - isn't a different operator needed? In C, '^' is bitwise XOR,
not exponentiation, so 2 ^ nretry does not double on each retry.
If so, please add a new patch for this.
--tm
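
For illustration, a standalone sketch comparing the expression from the
quoted hunk with a shift-based alternative. The function names are
hypothetical, and the shift is only one possible way to get a delay that
doubles; it is not taken from the driver.

```c
#include <assert.h>

/* The expression as written in the quoted hunk: bitwise XOR with 2. */
unsigned int delay_xor(unsigned int nretry)
{
	return 2 ^ nretry;
}

/* One possible alternative: a left shift, which doubles on each retry. */
unsigned int delay_shift(unsigned int nretry)
{
	return 1U << nretry;
}

/* Self-check documenting the first few values of each expression. */
void delay_selfcheck(void)
{
	/* XOR yields 2, 3, 0, 1 for nretry = 0..3: not monotonic at all. */
	assert(delay_xor(0) == 2);
	assert(delay_xor(1) == 3);
	assert(delay_xor(2) == 0);
	assert(delay_xor(3) == 1);

	/* The shift yields 1, 2, 4, 8 for nretry = 0..3. */
	assert(delay_shift(0) == 1);
	assert(delay_shift(1) == 2);
	assert(delay_shift(2) == 4);
	assert(delay_shift(3) == 8);
}
```

Note that with XOR the third retry (nretry == 2) asks for a delay of zero,
so the loop would re-read the register immediately instead of backing off.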

> +	} while (nretry++ < MC_ROOM_RETRY_CNT);
> +}
> +
> +/**
> + * send_cmd() - sends an AFU command
> + * @afu:	AFU associated with the host.
> + * @cmd:	AFU command to send.
> + *
> + * Return:
> + *	0 on success or SCSI_MLQUEUE_HOST_BUSY
> + */
> +static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
> +{
> +	struct cxlflash_cfg *cfg = afu->parent;
> +	struct device *dev = &cfg->dev->dev;
> +	int nretry = 0;
> +	int rc = 0;
> +	u64 room;
> +	long newval;
> +
> +	/*
> +	 * This routine is used by critical users such an AFU sync and to
> +	 * send a task management function (TMF). Thus we want to retry a
> +	 * bit before returning an error. To avoid the performance penalty
> +	 * of MMIO, we spread the update of 'room' over multiple commands.
> +	 */
> +retry:
> +	newval = atomic64_dec_if_positive(&afu->room);
> +	if (!newval) {
> +		do {
> +			room = readq_be(&afu->host_map->cmd_room);
> +			atomic64_set(&afu->room, room);
> +			if (room)
> +				goto write_ioarrin;
> +			udelay(nretry);
> +		} while (nretry++ < MC_ROOM_RETRY_CNT);
> +
> +		dev_err(dev, "%s: no cmd_room to send 0x%X\n",
> +		       __func__, cmd->rcb.cdb[0]);
> +
> +		goto no_room;
> +	} else if (unlikely(newval < 0)) {
> +		/* This should be rare. i.e. Only if two threads race and
> +		 * decrement before the MMIO read is done. In this case
> +		 * just benefit from the other thread having updated
> +		 * afu->room.
> +		 */
> +		if (nretry++ < MC_ROOM_RETRY_CNT) {
> +			udelay(nretry);
> +			goto retry;
> +		}
> +
> +		goto no_room;
> +	}
> +
> +write_ioarrin:
> +	writeq_be((u64)&cmd->rcb, &afu->host_map->ioarrin);
> +out:
> +	pr_devel("%s: cmd=%p len=%d ea=%p rc=%d\n", __func__, cmd,
> +		 cmd->rcb.data_len, (void *)cmd->rcb.data_ea, rc);
> +	return rc;
> +
> +no_room:
> +	afu->read_room = true;
> +	schedule_work(&cfg->work_q);
> +	rc = SCSI_MLQUEUE_HOST_BUSY;
> +	goto out;
> +}
> +
> +/**
> + * wait_resp() - polls for a response or timeout to a sent AFU command
> + * @afu:	AFU associated with the host.
> + * @cmd:	AFU command that was sent.
> + */
> +static void wait_resp(struct afu *afu, struct afu_cmd *cmd)
> +{
> +	ulong timeout = msecs_to_jiffies(cmd->rcb.timeout * 2 * 1000);
> +
> +	timeout = wait_for_completion_timeout(&cmd->cevent, timeout);
> +	if (!timeout)
> +		context_reset(cmd);
> +
> +	if (unlikely(cmd->sa.ioasc != 0))
> +		pr_err("%s: CMD 0x%X failed, IOASC: flags 0x%X, afu_rc 0x%X, "
> +		       "scsi_rc 0x%X, fc_rc 0x%X\n", __func__, cmd->rcb.cdb[0],
> +		       cmd->sa.rc.flags, cmd->sa.rc.afu_rc, cmd->sa.rc.scsi_rc,
> +		       cmd->sa.rc.fc_rc);
> +}
> +
> +/**
>   * send_tmf() - sends a Task Management Function (TMF)
>   * @afu:	AFU to checkout from.
>   * @scp:	SCSI command from stack.
> @@ -280,7 +420,7 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
>  	ulong lock_flags;
>  	int rc = 0;
>  
> -	cmd = cxlflash_cmd_checkout(afu);
> +	cmd = cmd_checkout(afu);
>  	if (unlikely(!cmd)) {
>  		pr_err("%s: could not get a free command\n", __func__);
>  		rc = SCSI_MLQUEUE_HOST_BUSY;
> @@ -313,9 +453,9 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
>  	memcpy(cmd->rcb.cdb, &tmfcmd, sizeof(tmfcmd));
>  
>  	/* Send the command */
> -	rc = cxlflash_send_cmd(afu, cmd);
> +	rc = send_cmd(afu, cmd);
>  	if (unlikely(rc)) {
> -		cxlflash_cmd_checkin(cmd);
> +		cmd_checkin(cmd);
>  		spin_lock_irqsave(&cfg->tmf_waitq.lock, lock_flags);
>  		cfg->tmf_active = false;
>  		spin_unlock_irqrestore(&cfg->tmf_waitq.lock, lock_flags);
> @@ -398,7 +538,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
>  		break;
>  	}
>  
> -	cmd = cxlflash_cmd_checkout(afu);
> +	cmd = cmd_checkout(afu);
>  	if (unlikely(!cmd)) {
>  		pr_err("%s: could not get a free command\n", __func__);
>  		rc = SCSI_MLQUEUE_HOST_BUSY;
> @@ -438,9 +578,9 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
>  	memcpy(cmd->rcb.cdb, scp->cmnd, sizeof(cmd->rcb.cdb));
>  
>  	/* Send the command */
> -	rc = cxlflash_send_cmd(afu, cmd);
> +	rc = send_cmd(afu, cmd);
>  	if (unlikely(rc)) {
> -		cxlflash_cmd_checkin(cmd);
> +		cmd_checkin(cmd);
>  		scsi_dma_unmap(scp);
>  	}
>  
> @@ -449,369 +589,55 @@ out:
>  }
>  
>  /**
> - * cxlflash_eh_device_reset_handler() - reset a single LUN
> - * @scp:	SCSI command to send.
> - *
> - * Return:
> - *	SUCCESS as defined in scsi/scsi.h
> - *	FAILED as defined in scsi/scsi.h
> + * cxlflash_wait_for_pci_err_recovery() - wait for error recovery during probe
> + * @cxlflash:	Internal structure associated with the host.
>   */
> -static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
> +static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
>  {
> -	int rc = SUCCESS;
> -	struct Scsi_Host *host = scp->device->host;
> -	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
> -	struct afu *afu = cfg->afu;
> -	int rcr = 0;
> -
> -	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
> -		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
> -		 host->host_no, scp->device->channel,
> -		 scp->device->id, scp->device->lun,
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
> -
> -	switch (cfg->state) {
> -	case STATE_NORMAL:
> -		rcr = send_tmf(afu, scp, TMF_LUN_RESET);
> -		if (unlikely(rcr))
> -			rc = FAILED;
> -		break;
> -	case STATE_RESET:
> -		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
> -		if (cfg->state == STATE_NORMAL)
> -			break;
> -		/* fall through */
> -	default:
> -		rc = FAILED;
> -		break;
> -	}
> +	struct pci_dev *pdev = cfg->dev;
>  
> -	pr_debug("%s: returning rc=%d\n", __func__, rc);
> -	return rc;
> +	if (pci_channel_offline(pdev))
> +		wait_event_timeout(cfg->reset_waitq,
> +				   !pci_channel_offline(pdev),
> +				   CXLFLASH_PCI_ERROR_RECOVERY_TIMEOUT);
>  }
>  
>  /**
> - * cxlflash_eh_host_reset_handler() - reset the host adapter
> - * @scp:	SCSI command from stack identifying host.
> - *
> - * Return:
> - *	SUCCESS as defined in scsi/scsi.h
> - *	FAILED as defined in scsi/scsi.h
> + * free_mem() - free memory associated with the AFU
> + * @cxlflash:	Internal structure associated with the host.
>   */
> -static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
> +static void free_mem(struct cxlflash_cfg *cfg)
>  {
> -	int rc = SUCCESS;
> -	int rcr = 0;
> -	struct Scsi_Host *host = scp->device->host;
> -	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
> +	int i;
> +	char *buf = NULL;
> +	struct afu *afu = cfg->afu;
>  
> -	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
> -		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
> -		 host->host_no, scp->device->channel,
> -		 scp->device->id, scp->device->lun,
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
> -		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
> +	if (cfg->afu) {
> +		for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
> +			buf = afu->cmd[i].buf;
> +			if (!((u64)buf & (PAGE_SIZE - 1)))
> +				free_page((ulong)buf);
> +		}
>  
> -	switch (cfg->state) {
> -	case STATE_NORMAL:
> -		cfg->state = STATE_RESET;
> -		scsi_block_requests(cfg->host);
> -		cxlflash_mark_contexts_error(cfg);
> -		rcr = cxlflash_afu_reset(cfg);
> -		if (rcr) {
> -			rc = FAILED;
> -			cfg->state = STATE_FAILTERM;
> -		} else
> -			cfg->state = STATE_NORMAL;
> -		wake_up_all(&cfg->reset_waitq);
> -		scsi_unblock_requests(cfg->host);
> -		break;
> -	case STATE_RESET:
> -		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
> -		if (cfg->state == STATE_NORMAL)
> -			break;
> -		/* fall through */
> -	default:
> -		rc = FAILED;
> -		break;
> +		free_pages((ulong)afu, get_order(sizeof(struct afu)));
> +		cfg->afu = NULL;
>  	}
> -
> -	pr_debug("%s: returning rc=%d\n", __func__, rc);
> -	return rc;
>  }
>  
>  /**
> - * cxlflash_change_queue_depth() - change the queue depth for the device
> - * @sdev:	SCSI device destined for queue depth change.
> - * @qdepth:	Requested queue depth value to set.
> - *
> - * The requested queue depth is capped to the maximum supported value.
> + * stop_afu() - stops the AFU command timers and unmaps the MMIO space
> + * @cxlflash:	Internal structure associated with the host.
>   *
> - * Return: The actual queue depth set.
> + * Safe to call with AFU in a partially allocated/initialized state.
>   */
> -static int cxlflash_change_queue_depth(struct scsi_device *sdev, int qdepth)
> +static void stop_afu(struct cxlflash_cfg *cfg)
>  {
> +	int i;
> +	struct afu *afu = cfg->afu;
>  
> -	if (qdepth > CXLFLASH_MAX_CMDS_PER_LUN)
> -		qdepth = CXLFLASH_MAX_CMDS_PER_LUN;
> -
> -	scsi_change_queue_depth(sdev, qdepth);
> -	return sdev->queue_depth;
> -}
> -
> -/**
> - * cxlflash_show_port_status() - queries and presents the current port status
> - * @dev:	Generic device associated with the host owning the port.
> - * @attr:	Device attribute representing the port.
> - * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
> - *
> - * Return: The size of the ASCII string returned in @buf.
> - */
> -static ssize_t cxlflash_show_port_status(struct device *dev,
> -					 struct device_attribute *attr,
> -					 char *buf)
> -{
> -	struct Scsi_Host *shost = class_to_shost(dev);
> -	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
> -	struct afu *afu = cfg->afu;
> -
> -	char *disp_status;
> -	int rc;
> -	u32 port;
> -	u64 status;
> -	u64 *fc_regs;
> -
> -	rc = kstrtouint((attr->attr.name + 4), 10, &port);
> -	if (rc || (port >= NUM_FC_PORTS))
> -		return 0;
> -
> -	fc_regs = &afu->afu_map->global.fc_regs[port][0];
> -	status =
> -	    (readq_be(&fc_regs[FC_MTIP_STATUS / 8]) & FC_MTIP_STATUS_MASK);
> -
> -	if (status == FC_MTIP_STATUS_ONLINE)
> -		disp_status = "online";
> -	else if (status == FC_MTIP_STATUS_OFFLINE)
> -		disp_status = "offline";
> -	else
> -		disp_status = "unknown";
> -
> -	return snprintf(buf, PAGE_SIZE, "%s\n", disp_status);
> -}
> -
> -/**
> - * cxlflash_show_lun_mode() - presents the current LUN mode of the host
> - * @dev:	Generic device associated with the host.
> - * @attr:	Device attribute representing the lun mode.
> - * @buf:	Buffer of length PAGE_SIZE to report back the LUN mode in ASCII.
> - *
> - * Return: The size of the ASCII string returned in @buf.
> - */
> -static ssize_t cxlflash_show_lun_mode(struct device *dev,
> -				      struct device_attribute *attr, char *buf)
> -{
> -	struct Scsi_Host *shost = class_to_shost(dev);
> -	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
> -	struct afu *afu = cfg->afu;
> -
> -	return snprintf(buf, PAGE_SIZE, "%u\n", afu->internal_lun);
> -}
> -
> -/**
> - * cxlflash_store_lun_mode() - sets the LUN mode of the host
> - * @dev:	Generic device associated with the host.
> - * @attr:	Device attribute representing the lun mode.
> - * @buf:	Buffer of length PAGE_SIZE containing the LUN mode in ASCII.
> - * @count:	Length of data resizing in @buf.
> - *
> - * The CXL Flash AFU supports a dummy LUN mode where the external
> - * links and storage are not required. Space on the FPGA is used
> - * to create 1 or 2 small LUNs which are presented to the system
> - * as if they were a normal storage device. This feature is useful
> - * during development and also provides manufacturing with a way
> - * to test the AFU without an actual device.
> - *
> - * 0 = external LUN[s] (default)
> - * 1 = internal LUN (1 x 64K, 512B blocks, id 0)
> - * 2 = internal LUN (1 x 64K, 4K blocks, id 0)
> - * 3 = internal LUN (2 x 32K, 512B blocks, ids 0,1)
> - * 4 = internal LUN (2 x 32K, 4K blocks, ids 0,1)
> - *
> - * Return: The size of the ASCII string returned in @buf.
> - */
> -static ssize_t cxlflash_store_lun_mode(struct device *dev,
> -				       struct device_attribute *attr,
> -				       const char *buf, size_t count)
> -{
> -	struct Scsi_Host *shost = class_to_shost(dev);
> -	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
> -	struct afu *afu = cfg->afu;
> -	int rc;
> -	u32 lun_mode;
> -
> -	rc = kstrtouint(buf, 10, &lun_mode);
> -	if (!rc && (lun_mode < 5) && (lun_mode != afu->internal_lun)) {
> -		afu->internal_lun = lun_mode;
> -		cxlflash_afu_reset(cfg);
> -		scsi_scan_host(cfg->host);
> -	}
> -
> -	return count;
> -}
> -
> -/**
> - * cxlflash_show_ioctl_version() - presents the current ioctl version of the host
> - * @dev:	Generic device associated with the host.
> - * @attr:	Device attribute representing the ioctl version.
> - * @buf:	Buffer of length PAGE_SIZE to report back the ioctl version.
> - *
> - * Return: The size of the ASCII string returned in @buf.
> - */
> -static ssize_t cxlflash_show_ioctl_version(struct device *dev,
> -					   struct device_attribute *attr,
> -					   char *buf)
> -{
> -	return scnprintf(buf, PAGE_SIZE, "%u\n", DK_CXLFLASH_VERSION_0);
> -}
> -
> -/**
> - * cxlflash_show_dev_mode() - presents the current mode of the device
> - * @dev:	Generic device associated with the device.
> - * @attr:	Device attribute representing the device mode.
> - * @buf:	Buffer of length PAGE_SIZE to report back the dev mode in ASCII.
> - *
> - * Return: The size of the ASCII string returned in @buf.
> - */
> -static ssize_t cxlflash_show_dev_mode(struct device *dev,
> -				      struct device_attribute *attr, char *buf)
> -{
> -	struct scsi_device *sdev = to_scsi_device(dev);
> -
> -	return snprintf(buf, PAGE_SIZE, "%s\n",
> -			sdev->hostdata ? "superpipe" : "legacy");
> -}
> -
> -/**
> - * cxlflash_wait_for_pci_err_recovery() - wait for error recovery during probe
> - * @cxlflash:	Internal structure associated with the host.
> - */
> -static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
> -{
> -	struct pci_dev *pdev = cfg->dev;
> -
> -	if (pci_channel_offline(pdev))
> -		wait_event_timeout(cfg->reset_waitq,
> -				   !pci_channel_offline(pdev),
> -				   CXLFLASH_PCI_ERROR_RECOVERY_TIMEOUT);
> -}
> -
> -/*
> - * Host attributes
> - */
> -static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL);
> -static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL);
> -static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode,
> -		   cxlflash_store_lun_mode);
> -static DEVICE_ATTR(ioctl_version, S_IRUGO, cxlflash_show_ioctl_version, NULL);
> -
> -static struct device_attribute *cxlflash_host_attrs[] = {
> -	&dev_attr_port0,
> -	&dev_attr_port1,
> -	&dev_attr_lun_mode,
> -	&dev_attr_ioctl_version,
> -	NULL
> -};
> -
> -/*
> - * Device attributes
> - */
> -static DEVICE_ATTR(mode, S_IRUGO, cxlflash_show_dev_mode, NULL);
> -
> -static struct device_attribute *cxlflash_dev_attrs[] = {
> -	&dev_attr_mode,
> -	NULL
> -};
> -
> -/*
> - * Host template
> - */
> -static struct scsi_host_template driver_template = {
> -	.module = THIS_MODULE,
> -	.name = CXLFLASH_ADAPTER_NAME,
> -	.info = cxlflash_driver_info,
> -	.ioctl = cxlflash_ioctl,
> -	.proc_name = CXLFLASH_NAME,
> -	.queuecommand = cxlflash_queuecommand,
> -	.eh_device_reset_handler = cxlflash_eh_device_reset_handler,
> -	.eh_host_reset_handler = cxlflash_eh_host_reset_handler,
> -	.change_queue_depth = cxlflash_change_queue_depth,
> -	.cmd_per_lun = 16,
> -	.can_queue = CXLFLASH_MAX_CMDS,
> -	.this_id = -1,
> -	.sg_tablesize = SG_NONE,	/* No scatter gather support. */
> -	.max_sectors = CXLFLASH_MAX_SECTORS,
> -	.use_clustering = ENABLE_CLUSTERING,
> -	.shost_attrs = cxlflash_host_attrs,
> -	.sdev_attrs = cxlflash_dev_attrs,
> -};
> -
> -/*
> - * Device dependent values
> - */
> -static struct dev_dependent_vals dev_corsa_vals = { CXLFLASH_MAX_SECTORS };
> -
> -/*
> - * PCI device binding table
> - */
> -static struct pci_device_id cxlflash_pci_table[] = {
> -	{PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CORSA,
> -	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, (kernel_ulong_t)&dev_corsa_vals},
> -	{}
> -};
> -
> -MODULE_DEVICE_TABLE(pci, cxlflash_pci_table);
> -
> -/**
> - * free_mem() - free memory associated with the AFU
> - * @cxlflash:	Internal structure associated with the host.
> - */
> -static void free_mem(struct cxlflash_cfg *cfg)
> -{
> -	int i;
> -	char *buf = NULL;
> -	struct afu *afu = cfg->afu;
> -
> -	if (cfg->afu) {
> -		for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
> -			buf = afu->cmd[i].buf;
> -			if (!((u64)buf & (PAGE_SIZE - 1)))
> -				free_page((ulong)buf);
> -		}
> -
> -		free_pages((ulong)afu, get_order(sizeof(struct afu)));
> -		cfg->afu = NULL;
> -	}
> -}
> -
> -/**
> - * stop_afu() - stops the AFU command timers and unmaps the MMIO space
> - * @cxlflash:	Internal structure associated with the host.
> - *
> - * Safe to call with AFU in a partially allocated/initialized state.
> - */
> -static void stop_afu(struct cxlflash_cfg *cfg)
> -{
> -	int i;
> -	struct afu *afu = cfg->afu;
> -
> -	if (likely(afu)) {
> -		for (i = 0; i < CXLFLASH_NUM_CMDS; i++)
> -			complete(&afu->cmd[i].cevent);
> +	if (likely(afu)) {
> +		for (i = 0; i < CXLFLASH_NUM_CMDS; i++)
> +			complete(&afu->cmd[i].cevent);
>  
>  		if (likely(afu->afu_map)) {
>  			cxl_psa_unmap((void *)afu->afu_map);
> @@ -1640,67 +1466,13 @@ out:
>  }
>  
>  /**
> - * cxlflash_context_reset() - timeout handler for AFU commands
> - * @cmd:	AFU command that timed out.
> + * init_pcr() - initialize the provisioning and control registers
> + * @cxlflash:	Internal structure associated with the host.
>   *
> - * Sends a reset to the AFU.
> + * Also sets up fast access to the mapped registers and initializes AFU
> + * command fields that never change.
>   */
> -void cxlflash_context_reset(struct afu_cmd *cmd)
> -{
> -	int nretry = 0;
> -	u64 rrin = 0x1;
> -	u64 room = 0;
> -	struct afu *afu = cmd->parent;
> -	ulong lock_flags;
> -
> -	pr_debug("%s: cmd=%p\n", __func__, cmd);
> -
> -	spin_lock_irqsave(&cmd->slock, lock_flags);
> -
> -	/* Already completed? */
> -	if (cmd->sa.host_use_b[0] & B_DONE) {
> -		spin_unlock_irqrestore(&cmd->slock, lock_flags);
> -		return;
> -	}
> -
> -	cmd->sa.host_use_b[0] |= (B_DONE | B_ERROR | B_TIMEOUT);
> -	spin_unlock_irqrestore(&cmd->slock, lock_flags);
> -
> -	/*
> -	 * We really want to send this reset at all costs, so spread
> -	 * out wait time on successive retries for available room.
> -	 */
> -	do {
> -		room = readq_be(&afu->host_map->cmd_room);
> -		atomic64_set(&afu->room, room);
> -		if (room)
> -			goto write_rrin;
> -		udelay(nretry);
> -	} while (nretry++ < MC_ROOM_RETRY_CNT);
> -
> -	pr_err("%s: no cmd_room to send reset\n", __func__);
> -	return;
> -
> -write_rrin:
> -	nretry = 0;
> -	writeq_be(rrin, &afu->host_map->ioarrin);
> -	do {
> -		rrin = readq_be(&afu->host_map->ioarrin);
> -		if (rrin != 0x1)
> -			break;
> -		/* Double delay each time */
> -		udelay(2 ^ nretry);
> -	} while (nretry++ < MC_ROOM_RETRY_CNT);
> -}
> -
> -/**
> - * init_pcr() - initialize the provisioning and control registers
> - * @cxlflash:	Internal structure associated with the host.
> - *
> - * Also sets up fast access to the mapped registers and initializes AFU
> - * command fields that never change.
> - */
> -void init_pcr(struct cxlflash_cfg *cfg)
> +static void init_pcr(struct cxlflash_cfg *cfg)
>  {
>  	struct afu *afu = cfg->afu;
>  	struct sisl_ctrl_map *ctrl_map;
> @@ -1736,7 +1508,7 @@ void init_pcr(struct cxlflash_cfg *cfg)
>   * init_global() - initialize AFU global registers
>   * @cxlflash:	Internal structure associated with the host.
>   */
> -int init_global(struct cxlflash_cfg *cfg)
> +static int init_global(struct cxlflash_cfg *cfg)
>  {
>  	struct afu *afu = cfg->afu;
>  	u64 wwpn[NUM_FC_PORTS];	/* wwpn of AFU ports */
> @@ -2007,92 +1779,6 @@ err1:
>  }
>  
>  /**
> - * cxlflash_send_cmd() - sends an AFU command
> - * @afu:	AFU associated with the host.
> - * @cmd:	AFU command to send.
> - *
> - * Return:
> - *	0 on success
> - *	-1 on failure
> - */
> -int cxlflash_send_cmd(struct afu *afu, struct afu_cmd *cmd)
> -{
> -	struct cxlflash_cfg *cfg = afu->parent;
> -	int nretry = 0;
> -	int rc = 0;
> -	u64 room;
> -	long newval;
> -
> -	/*
> -	 * This routine is used by critical users such an AFU sync and to
> -	 * send a task management function (TMF). Thus we want to retry a
> -	 * bit before returning an error. To avoid the performance penalty
> -	 * of MMIO, we spread the update of 'room' over multiple commands.
> -	 */
> -retry:
> -	newval = atomic64_dec_if_positive(&afu->room);
> -	if (!newval) {
> -		do {
> -			room = readq_be(&afu->host_map->cmd_room);
> -			atomic64_set(&afu->room, room);
> -			if (room)
> -				goto write_ioarrin;
> -			udelay(nretry);
> -		} while (nretry++ < MC_ROOM_RETRY_CNT);
> -
> -		pr_err("%s: no cmd_room to send 0x%X\n",
> -		       __func__, cmd->rcb.cdb[0]);
> -
> -		goto no_room;
> -	} else if (unlikely(newval < 0)) {
> -		/* This should be rare. i.e. Only if two threads race and
> -		 * decrement before the MMIO read is done. In this case
> -		 * just benefit from the other thread having updated
> -		 * afu->room.
> -		 */
> -		if (nretry++ < MC_ROOM_RETRY_CNT) {
> -			udelay(nretry);
> -			goto retry;
> -		}
> -
> -		goto no_room;
> -	}
> -
> -write_ioarrin:
> -	writeq_be((u64)&cmd->rcb, &afu->host_map->ioarrin);
> -out:
> -	pr_debug("%s: cmd=%p len=%d ea=%p rc=%d\n", __func__, cmd,
> -		 cmd->rcb.data_len, (void *)cmd->rcb.data_ea, rc);
> -	return rc;
> -
> -no_room:
> -	afu->read_room = true;
> -	schedule_work(&cfg->work_q);
> -	rc = SCSI_MLQUEUE_HOST_BUSY;
> -	goto out;
> -}
> -
> -/**
> - * cxlflash_wait_resp() - polls for a response or timeout to a sent AFU command
> - * @afu:	AFU associated with the host.
> - * @cmd:	AFU command that was sent.
> - */
> -void cxlflash_wait_resp(struct afu *afu, struct afu_cmd *cmd)
> -{
> -	ulong timeout = jiffies + (cmd->rcb.timeout * 2 * HZ);
> -
> -	timeout = wait_for_completion_timeout(&cmd->cevent, timeout);
> -	if (!timeout)
> -		cxlflash_context_reset(cmd);
> -
> -	if (unlikely(cmd->sa.ioasc != 0))
> -		pr_err("%s: CMD 0x%X failed, IOASC: flags 0x%X, afu_rc 0x%X, "
> -		       "scsi_rc 0x%X, fc_rc 0x%X\n", __func__, cmd->rcb.cdb[0],
> -		       cmd->sa.rc.flags, cmd->sa.rc.afu_rc, cmd->sa.rc.scsi_rc,
> -		       cmd->sa.rc.fc_rc);
> -}
> -
> -/**
>   * cxlflash_afu_sync() - builds and sends an AFU sync command
>   * @afu:	AFU associated with the host.
>   * @ctx_hndl_u:	Identifies context requesting sync.
> @@ -2130,7 +1816,7 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
>  
>  	mutex_lock(&sync_active);
>  retry:
> -	cmd = cxlflash_cmd_checkout(afu);
> +	cmd = cmd_checkout(afu);
>  	if (unlikely(!cmd)) {
>  		retry_cnt++;
>  		udelay(1000 * retry_cnt);
> @@ -2159,11 +1845,11 @@ retry:
>  	*((u16 *)&cmd->rcb.cdb[2]) = swab16(ctx_hndl_u);
>  	*((u32 *)&cmd->rcb.cdb[4]) = swab32(res_hndl_u);
>  
> -	rc = cxlflash_send_cmd(afu, cmd);
> +	rc = send_cmd(afu, cmd);
>  	if (unlikely(rc))
>  		goto out;
>  
> -	cxlflash_wait_resp(afu, cmd);
> +	wait_resp(afu, cmd);
>  
>  	/* set on timeout */
>  	if (unlikely((cmd->sa.ioasc != 0) ||
> @@ -2172,20 +1858,20 @@ retry:
>  out:
>  	mutex_unlock(&sync_active);
>  	if (cmd)
> -		cxlflash_cmd_checkin(cmd);
> +		cmd_checkin(cmd);
>  	pr_debug("%s: returning rc=%d\n", __func__, rc);
>  	return rc;
>  }
>  
>  /**
> - * cxlflash_afu_reset() - resets the AFU
> - * @cxlflash:	Internal structure associated with the host.
> + * afu_reset() - resets the AFU
> + * @cfg:	Internal structure associated with the host.
>   *
>   * Return:
>   *	0 on success
>   *	A failure value from internal services.
>   */
> -int cxlflash_afu_reset(struct cxlflash_cfg *cfg)
> +static int afu_reset(struct cxlflash_cfg *cfg)
>  {
>  	int rc = 0;
>  	/* Stop the context before the reset. Since the context is
> @@ -2201,6 +1887,320 @@ int cxlflash_afu_reset(struct cxlflash_cfg *cfg)
>  }
>  
>  /**
> + * cxlflash_eh_device_reset_handler() - reset a single LUN
> + * @scp:	SCSI command to send.
> + *
> + * Return:
> + *	SUCCESS as defined in scsi/scsi.h
> + *	FAILED as defined in scsi/scsi.h
> + */
> +static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
> +{
> +	int rc = SUCCESS;
> +	struct Scsi_Host *host = scp->device->host;
> +	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
> +	struct afu *afu = cfg->afu;
> +	int rcr = 0;
> +
> +	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
> +		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
> +		 host->host_no, scp->device->channel,
> +		 scp->device->id, scp->device->lun,
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
> +
> +	switch (cfg->state) {
> +	case STATE_NORMAL:
> +		rcr = send_tmf(afu, scp, TMF_LUN_RESET);
> +		if (unlikely(rcr))
> +			rc = FAILED;
> +		break;
> +	case STATE_RESET:
> +		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
> +		if (cfg->state == STATE_NORMAL)
> +			break;
> +		/* fall through */
> +	default:
> +		rc = FAILED;
> +		break;
> +	}
> +
> +	pr_debug("%s: returning rc=%d\n", __func__, rc);
> +	return rc;
> +}
> +
> +/**
> + * cxlflash_eh_host_reset_handler() - reset the host adapter
> + * @scp:	SCSI command from stack identifying host.
> + *
> + * Return:
> + *	SUCCESS as defined in scsi/scsi.h
> + *	FAILED as defined in scsi/scsi.h
> + */
> +static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
> +{
> +	int rc = SUCCESS;
> +	int rcr = 0;
> +	struct Scsi_Host *host = scp->device->host;
> +	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
> +
> +	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
> +		 "cdb=(%08X-%08X-%08X-%08X)\n", __func__, scp,
> +		 host->host_no, scp->device->channel,
> +		 scp->device->id, scp->device->lun,
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
> +		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
> +
> +	switch (cfg->state) {
> +	case STATE_NORMAL:
> +		cfg->state = STATE_RESET;
> +		scsi_block_requests(cfg->host);
> +		cxlflash_mark_contexts_error(cfg);
> +		rcr = afu_reset(cfg);
> +		if (rcr) {
> +			rc = FAILED;
> +			cfg->state = STATE_FAILTERM;
> +		} else
> +			cfg->state = STATE_NORMAL;
> +		wake_up_all(&cfg->reset_waitq);
> +		scsi_unblock_requests(cfg->host);
> +		break;
> +	case STATE_RESET:
> +		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
> +		if (cfg->state == STATE_NORMAL)
> +			break;
> +		/* fall through */
> +	default:
> +		rc = FAILED;
> +		break;
> +	}
> +
> +	pr_debug("%s: returning rc=%d\n", __func__, rc);
> +	return rc;
> +}
> +
> +/**
> + * cxlflash_change_queue_depth() - change the queue depth for the device
> + * @sdev:	SCSI device destined for queue depth change.
> + * @qdepth:	Requested queue depth value to set.
> + *
> + * The requested queue depth is capped to the maximum supported value.
> + *
> + * Return: The actual queue depth set.
> + */
> +static int cxlflash_change_queue_depth(struct scsi_device *sdev, int qdepth)
> +{
> +
> +	if (qdepth > CXLFLASH_MAX_CMDS_PER_LUN)
> +		qdepth = CXLFLASH_MAX_CMDS_PER_LUN;
> +
> +	scsi_change_queue_depth(sdev, qdepth);
> +	return sdev->queue_depth;
> +}
> +
> +/**
> + * cxlflash_show_port_status() - queries and presents the current port status
> + * @dev:	Generic device associated with the host owning the port.
> + * @attr:	Device attribute representing the port.
> + * @buf:	Buffer of length PAGE_SIZE to report back port status in ASCII.
> + *
> + * Return: The size of the ASCII string returned in @buf.
> + */
> +static ssize_t cxlflash_show_port_status(struct device *dev,
> +					 struct device_attribute *attr,
> +					 char *buf)
> +{
> +	struct Scsi_Host *shost = class_to_shost(dev);
> +	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
> +	struct afu *afu = cfg->afu;
> +
> +	char *disp_status;
> +	int rc;
> +	u32 port;
> +	u64 status;
> +	u64 *fc_regs;
> +
> +	rc = kstrtouint((attr->attr.name + 4), 10, &port);
> +	if (rc || (port >= NUM_FC_PORTS))
> +		return 0;
> +
> +	fc_regs = &afu->afu_map->global.fc_regs[port][0];
> +	status =
> +	    (readq_be(&fc_regs[FC_MTIP_STATUS / 8]) & FC_MTIP_STATUS_MASK);
> +
> +	if (status == FC_MTIP_STATUS_ONLINE)
> +		disp_status = "online";
> +	else if (status == FC_MTIP_STATUS_OFFLINE)
> +		disp_status = "offline";
> +	else
> +		disp_status = "unknown";
> +
> +	return snprintf(buf, PAGE_SIZE, "%s\n", disp_status);
> +}
> +
> +/**
> + * cxlflash_show_lun_mode() - presents the current LUN mode of the host
> + * @dev:	Generic device associated with the host.
> + * @attr:	Device attribute representing the lun mode.
> + * @buf:	Buffer of length PAGE_SIZE to report back the LUN mode in ASCII.
> + *
> + * Return: The size of the ASCII string returned in @buf.
> + */
> +static ssize_t cxlflash_show_lun_mode(struct device *dev,
> +				      struct device_attribute *attr, char *buf)
> +{
> +	struct Scsi_Host *shost = class_to_shost(dev);
> +	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
> +	struct afu *afu = cfg->afu;
> +
> +	return snprintf(buf, PAGE_SIZE, "%u\n", afu->internal_lun);
> +}
> +
> +/**
> + * cxlflash_store_lun_mode() - sets the LUN mode of the host
> + * @dev:	Generic device associated with the host.
> + * @attr:	Device attribute representing the lun mode.
> + * @buf:	Buffer of length PAGE_SIZE containing the LUN mode in ASCII.
> + * @count:	Length of data residing in @buf.
> + *
> + * The CXL Flash AFU supports a dummy LUN mode where the external
> + * links and storage are not required. Space on the FPGA is used
> + * to create 1 or 2 small LUNs which are presented to the system
> + * as if they were a normal storage device. This feature is useful
> + * during development and also provides manufacturing with a way
> + * to test the AFU without an actual device.
> + *
> + * 0 = external LUN[s] (default)
> + * 1 = internal LUN (1 x 64K, 512B blocks, id 0)
> + * 2 = internal LUN (1 x 64K, 4K blocks, id 0)
> + * 3 = internal LUN (2 x 32K, 512B blocks, ids 0,1)
> + * 4 = internal LUN (2 x 32K, 4K blocks, ids 0,1)
> + *
> + * Return: The size of the ASCII string returned in @buf.
> + */
> +static ssize_t cxlflash_store_lun_mode(struct device *dev,
> +				       struct device_attribute *attr,
> +				       const char *buf, size_t count)
> +{
> +	struct Scsi_Host *shost = class_to_shost(dev);
> +	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)shost->hostdata;
> +	struct afu *afu = cfg->afu;
> +	int rc;
> +	u32 lun_mode;
> +
> +	rc = kstrtouint(buf, 10, &lun_mode);
> +	if (!rc && (lun_mode < 5) && (lun_mode != afu->internal_lun)) {
> +		afu->internal_lun = lun_mode;
> +		afu_reset(cfg);
> +		scsi_scan_host(cfg->host);
> +	}
> +
> +	return count;
> +}
> +
> +/**
> + * cxlflash_show_ioctl_version() - presents the hosts current ioctl version
> + * @dev:	Generic device associated with the host.
> + * @attr:	Device attribute representing the ioctl version.
> + * @buf:	Buffer of length PAGE_SIZE to report back the ioctl version.
> + *
> + * Return: The size of the ASCII string returned in @buf.
> + */
> +static ssize_t cxlflash_show_ioctl_version(struct device *dev,
> +					   struct device_attribute *attr,
> +					   char *buf)
> +{
> +	return scnprintf(buf, PAGE_SIZE, "%u\n", DK_CXLFLASH_VERSION_0);
> +}
> +
> +/**
> + * cxlflash_show_dev_mode() - presents the current mode of the device
> + * @dev:	Generic device associated with the device.
> + * @attr:	Device attribute representing the device mode.
> + * @buf:	Buffer of length PAGE_SIZE to report back the dev mode in ASCII.
> + *
> + * Return: The size of the ASCII string returned in @buf.
> + */
> +static ssize_t cxlflash_show_dev_mode(struct device *dev,
> +				      struct device_attribute *attr, char *buf)
> +{
> +	struct scsi_device *sdev = to_scsi_device(dev);
> +
> +	return snprintf(buf, PAGE_SIZE, "%s\n",
> +			sdev->hostdata ? "superpipe" : "legacy");
> +}
> +
> +/*
> + * Host attributes
> + */
> +static DEVICE_ATTR(port0, S_IRUGO, cxlflash_show_port_status, NULL);
> +static DEVICE_ATTR(port1, S_IRUGO, cxlflash_show_port_status, NULL);
> +static DEVICE_ATTR(lun_mode, S_IRUGO | S_IWUSR, cxlflash_show_lun_mode,
> +		   cxlflash_store_lun_mode);
> +static DEVICE_ATTR(ioctl_version, S_IRUGO, cxlflash_show_ioctl_version, NULL);
> +
> +static struct device_attribute *cxlflash_host_attrs[] = {
> +	&dev_attr_port0,
> +	&dev_attr_port1,
> +	&dev_attr_lun_mode,
> +	&dev_attr_ioctl_version,
> +	NULL
> +};
> +
> +/*
> + * Device attributes
> + */
> +static DEVICE_ATTR(mode, S_IRUGO, cxlflash_show_dev_mode, NULL);
> +
> +static struct device_attribute *cxlflash_dev_attrs[] = {
> +	&dev_attr_mode,
> +	NULL
> +};
> +
> +/*
> + * Host template
> + */
> +static struct scsi_host_template driver_template = {
> +	.module = THIS_MODULE,
> +	.name = CXLFLASH_ADAPTER_NAME,
> +	.info = cxlflash_driver_info,
> +	.ioctl = cxlflash_ioctl,
> +	.proc_name = CXLFLASH_NAME,
> +	.queuecommand = cxlflash_queuecommand,
> +	.eh_device_reset_handler = cxlflash_eh_device_reset_handler,
> +	.eh_host_reset_handler = cxlflash_eh_host_reset_handler,
> +	.change_queue_depth = cxlflash_change_queue_depth,
> +	.cmd_per_lun = 16,
> +	.can_queue = CXLFLASH_MAX_CMDS,
> +	.this_id = -1,
> +	.sg_tablesize = SG_NONE,	/* No scatter gather support. */
> +	.max_sectors = CXLFLASH_MAX_SECTORS,
> +	.use_clustering = ENABLE_CLUSTERING,
> +	.shost_attrs = cxlflash_host_attrs,
> +	.sdev_attrs = cxlflash_dev_attrs,
> +};
> +
> +/*
> + * Device dependent values
> + */
> +static struct dev_dependent_vals dev_corsa_vals = { CXLFLASH_MAX_SECTORS };
> +
> +/*
> + * PCI device binding table
> + */
> +static struct pci_device_id cxlflash_pci_table[] = {
> +	{PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CORSA,
> +	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, (kernel_ulong_t)&dev_corsa_vals},
> +	{}
> +};
> +
> +MODULE_DEVICE_TABLE(pci, cxlflash_pci_table);
> +
> +/**
>   * cxlflash_worker_thread() - work thread handler for the AFU
>   * @work:	Work structure contained within cxlflash associated with host.
>   *

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 21/30] cxlflash: Fix to prevent workq from accessing freed memory
  2015-09-16 21:31 ` [PATCH v2 21/30] cxlflash: Fix to prevent workq from accessing freed memory Matthew R. Ochs
@ 2015-09-21 12:25   ` Tomas Henzl
  2015-09-21 22:44     ` Matthew R. Ochs
  0 siblings, 1 reply; 79+ messages in thread
From: Tomas Henzl @ 2015-09-21 12:25 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Brian King, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 16.9.2015 23:31, Matthew R. Ochs wrote:
> The workq can process work in parallel with a remove event, leading
> to a condition where the workq handler can access freed memory.
>
> To remedy, the workq should be terminated prior to freeing memory. Move
> the termination call earlier in remove and use cancel_work_sync() instead
> of flush_work() as there is not a need to process any scheduled work when
> shutting down.
>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 1856a73..1625aea 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -736,12 +736,12 @@ static void cxlflash_remove(struct pci_dev *pdev)
>  		scsi_remove_host(cfg->host);
>  		/* Fall through */
>  	case INIT_STATE_AFU:
> +		cancel_work_sync(&cfg->work_q);
>  		term_afu(cfg);

You disable IRQs after the call to cancel_work_sync().
Doesn't that mean a late interrupt could schedule the work again?
Please disable IRQs earlier, as described in Documentation/PCI/pci.txt

>  	case INIT_STATE_PCI:
>  		pci_release_regions(cfg->dev);
>  		pci_disable_device(pdev);
>  	case INIT_STATE_NONE:
> -		flush_work(&cfg->work_q);
>  		free_mem(cfg);
>  		scsi_host_put(cfg->host);
>  		break;

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 29/30] cxlflash: Fix to avoid state change collision
  2015-09-16 21:32 ` [PATCH v2 29/30] cxlflash: Fix to avoid state change collision Matthew R. Ochs
@ 2015-09-21 12:44   ` Tomas Henzl
  2015-09-21 22:59     ` Matthew R. Ochs
  0 siblings, 1 reply; 79+ messages in thread
From: Tomas Henzl @ 2015-09-21 12:44 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Brian King, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 16.9.2015 23:32, Matthew R. Ochs wrote:
> The adapter state machine is susceptible to missing and/or
> corrupting state updates at runtime. This can lead to a variety
> of unintended issues and is due to the lack of a serialization
> mechanism to protect the adapter state.
>
> Use an adapter-wide mutex to serialize state changes.

I've only looked briefly at your code, but it seems to me that
an atomic variable would also serve your needs and might be
more efficient, resulting in faster code execution.

If you keep the mutex approach, you don't need two mutexes
in cxlflash_afu_sync; you should remove the sync_active mutex.
--tm
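
A minimal userspace sketch of the atomic-variable idea, using C11
<stdatomic.h> as a stand-in for the kernel's atomic helpers. The names
model_cfg, state_read() and state_try_change() are hypothetical and do not
exist in the driver; only the state names mirror the patch. A
compare-and-exchange lets exactly one caller win a given transition, while
readers take a lock-free snapshot. This covers single reads and single
transitions only; it does not by itself serialize multi-step sequences.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* State names mirror the patch; the struct itself is a hypothetical model. */
enum model_state { STATE_NORMAL, STATE_RESET, STATE_FAILTERM };

struct model_cfg {
	_Atomic int state;
};

/* Lock-free snapshot of the current state. */
static enum model_state state_read(struct model_cfg *cfg)
{
	return (enum model_state)atomic_load(&cfg->state);
}

/*
 * Move from 'from' to 'to' only if the state is still 'from'.
 * Returns true for the one caller that performed the transition.
 */
static bool state_try_change(struct model_cfg *cfg,
			     enum model_state from, enum model_state to)
{
	int expected = from;

	assert(from != to);	/* a no-op transition is a caller bug */
	return atomic_compare_exchange_strong(&cfg->state, &expected, to);
}
```

In this model, a second caller attempting NORMAL -> RESET while a reset is
already in progress sees the exchange fail and can wait instead of
overwriting the state.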

>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> Suggested-by: Brian King <brking@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/common.h    |  1 +
>  drivers/scsi/cxlflash/main.c      | 40 +++++++++++++++++++++++++++++++++------
>  drivers/scsi/cxlflash/superpipe.c |  7 ++++++-
>  3 files changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
> index e6041b9..c9b1ec6 100644
> --- a/drivers/scsi/cxlflash/common.h
> +++ b/drivers/scsi/cxlflash/common.h
> @@ -128,6 +128,7 @@ struct cxlflash_cfg {
>  	bool tmf_active;
>  	wait_queue_head_t reset_waitq;
>  	enum cxlflash_state state;
> +	struct mutex mutex;
>  };
>  
>  struct afu_cmd {
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 0487fac..a94340d 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -496,6 +496,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
>  	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
>  	struct afu *afu = cfg->afu;
>  	struct device *dev = &cfg->dev->dev;
> +	enum cxlflash_state state;
>  	struct afu_cmd *cmd;
>  	u32 port_sel = scp->device->channel + 1;
>  	int nseg, i, ncount;
> @@ -525,7 +526,11 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
>  	}
>  	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
>  
> -	switch (cfg->state) {
> +	mutex_lock(&cfg->mutex);
> +	state = cfg->state;
> +	mutex_unlock(&cfg->mutex);
> +
> +	switch (state) {
>  	case STATE_RESET:
>  		dev_dbg_ratelimited(dev, "%s: device is in reset!\n", __func__);
>  		rc = SCSI_MLQUEUE_HOST_BUSY;
> @@ -722,7 +727,9 @@ static void cxlflash_remove(struct pci_dev *pdev)
>  						  cfg->tmf_slock);
>  	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
>  
> +	mutex_lock(&cfg->mutex);
>  	cfg->state = STATE_FAILTERM;
> +	mutex_unlock(&cfg->mutex);
>  	atomic_inc(&cfg->remove_active);
>  	cxlflash_stop_term_user_contexts(cfg);
>  
> @@ -1811,12 +1818,13 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
>  	int retry_cnt = 0;
>  	static DEFINE_MUTEX(sync_active);
>  
> +	mutex_lock(&sync_active);
> +	mutex_lock(&cfg->mutex);
>  	if (cfg->state != STATE_NORMAL) {
>  		pr_debug("%s: Sync not required! (%u)\n", __func__, cfg->state);
> -		return 0;
> +		goto out;
>  	}
>  
> -	mutex_lock(&sync_active);
>  retry:
>  	cmd = cmd_checkout(afu);
>  	if (unlikely(!cmd)) {
> @@ -1858,6 +1866,7 @@ retry:
>  		     (cmd->sa.host_use_b[0] & B_ERROR)))
>  		rc = -1;
>  out:
> +	mutex_unlock(&cfg->mutex);
>  	mutex_unlock(&sync_active);
>  	if (cmd)
>  		cmd_checkin(cmd);
> @@ -1900,6 +1909,7 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
>  	struct Scsi_Host *host = scp->device->host;
>  	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
>  	struct afu *afu = cfg->afu;
> +	enum cxlflash_state state;
>  	int rcr = 0;
>  
>  	pr_debug("%s: (scp=%p) %d/%d/%d/%llu "
> @@ -1912,7 +1922,11 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
>  		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
>  
>  retry:
> -	switch (cfg->state) {
> +	mutex_lock(&cfg->mutex);
> +	state = cfg->state;
> +	mutex_unlock(&cfg->mutex);
> +
> +	switch (state) {
>  	case STATE_NORMAL:
>  		rcr = send_tmf(afu, scp, TMF_LUN_RESET);
>  		if (unlikely(rcr))
> @@ -1954,6 +1968,7 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
>  		 get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
>  		 get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
>  
> +	mutex_lock(&cfg->mutex);
>  	switch (cfg->state) {
>  	case STATE_NORMAL:
>  		cfg->state = STATE_RESET;
> @@ -1967,7 +1982,9 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
>  		wake_up_all(&cfg->reset_waitq);
>  		break;
>  	case STATE_RESET:
> +		mutex_unlock(&cfg->mutex);
>  		wait_event(cfg->reset_waitq, cfg->state != STATE_RESET);
> +		mutex_lock(&cfg->mutex);
>  		if (cfg->state == STATE_NORMAL)
>  			break;
>  		/* fall through */
> @@ -1975,6 +1992,7 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp)
>  		rc = FAILED;
>  		break;
>  	}
> +	mutex_unlock(&cfg->mutex);
>  
>  	pr_debug("%s: returning rc=%d\n", __func__, rc);
>  	return rc;
> @@ -2312,10 +2330,11 @@ static void cxlflash_worker_thread(struct work_struct *work)
>  	int port;
>  	ulong lock_flags;
>  
> -	/* Avoid MMIO if the device has failed */
> +	mutex_lock(&cfg->mutex);
>  
> +	/* Avoid MMIO if the device has failed */
>  	if (cfg->state != STATE_NORMAL)
> -		return;
> +		goto out;
>  
>  	spin_lock_irqsave(cfg->host->host_lock, lock_flags);
>  
> @@ -2346,6 +2365,8 @@ static void cxlflash_worker_thread(struct work_struct *work)
>  
>  	if (atomic_dec_if_positive(&cfg->scan_host_needed) >= 0)
>  		scsi_scan_host(cfg->host);
> +out:
> +	mutex_unlock(&cfg->mutex);
>  }
>  
>  /**
> @@ -2416,6 +2437,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
>  	INIT_WORK(&cfg->work_q, cxlflash_worker_thread);
>  	cfg->lr_state = LINK_RESET_INVALID;
>  	cfg->lr_port = -1;
> +	mutex_init(&cfg->mutex);
>  	mutex_init(&cfg->ctx_tbl_list_mutex);
>  	mutex_init(&cfg->ctx_recovery_mutex);
>  	init_rwsem(&cfg->ioctl_rwsem);
> @@ -2503,7 +2525,9 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
>  
>  	switch (state) {
>  	case pci_channel_io_frozen:
> +		mutex_lock(&cfg->mutex);
>  		cfg->state = STATE_RESET;
> +		mutex_unlock(&cfg->mutex);
>  		scsi_block_requests(cfg->host);
>  		drain_ioctls(cfg);
>  		rc = cxlflash_mark_contexts_error(cfg);
> @@ -2514,7 +2538,9 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
>  		stop_afu(cfg);
>  		return PCI_ERS_RESULT_NEED_RESET;
>  	case pci_channel_io_perm_failure:
> +		mutex_lock(&cfg->mutex);
>  		cfg->state = STATE_FAILTERM;
> +		mutex_unlock(&cfg->mutex);
>  		wake_up_all(&cfg->reset_waitq);
>  		scsi_unblock_requests(cfg->host);
>  		return PCI_ERS_RESULT_DISCONNECT;
> @@ -2561,7 +2587,9 @@ static void cxlflash_pci_resume(struct pci_dev *pdev)
>  
>  	dev_dbg(dev, "%s: pdev=%p\n", __func__, pdev);
>  
> +	mutex_lock(&cfg->mutex);
>  	cfg->state = STATE_NORMAL;
> +	mutex_unlock(&cfg->mutex);
>  	wake_up_all(&cfg->reset_waitq);
>  	scsi_unblock_requests(cfg->host);
>  }
> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
> index 9844788..c3aaadf 100644
> --- a/drivers/scsi/cxlflash/superpipe.c
> +++ b/drivers/scsi/cxlflash/superpipe.c
> @@ -1229,10 +1229,15 @@ static const struct file_operations null_fops = {
>  static int check_state(struct cxlflash_cfg *cfg, bool ioctl)
>  {
>  	struct device *dev = &cfg->dev->dev;
> +	enum cxlflash_state state;
>  	int rc = 0;
>  
>  retry:
> -	switch (cfg->state) {
> +	mutex_lock(&cfg->mutex);
> +	state = cfg->state;
> +	mutex_unlock(&cfg->mutex);
> +
> +	switch (state) {
>  	case STATE_RESET:
>  		dev_dbg(dev, "%s: Reset state, going to wait...\n", __func__);
>  		if (ioctl)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 14/30] cxlflash: Fix to avoid stall while waiting on TMF
  2015-09-16 21:30 ` [PATCH v2 14/30] cxlflash: Fix to avoid stall while waiting on TMF Matthew R. Ochs
@ 2015-09-21 18:24   ` Brian King
  2015-09-21 23:05     ` Matthew R. Ochs
  0 siblings, 1 reply; 79+ messages in thread
From: Brian King @ 2015-09-21 18:24 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
> Borrowing the TMF waitq's spinlock causes a stall condition when
> waiting for the TMF to complete. To remedy, introduce our own spin
> lock to serialize TMF and use the appropriate wait services.

Can you clarify what stall condition you were seeing? It's not obvious
to me what this fixes. Do you have softlockup logs from the failure?

-Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 15/30] cxlflash: Fix location of setting resid
  2015-09-16 21:30 ` [PATCH v2 15/30] cxlflash: Fix location of setting resid Matthew R. Ochs
@ 2015-09-21 18:28   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-21 18:28 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
> @@ -158,8 +160,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
>  				/* If the SISL_RC_FLAGS_OVERRUN flag was set,
>  				 * then we will handle this error else where.
>  				 * If not then we must handle it here.
> -				 * This is probably an AFU bug. We will
> -				 * attempt a retry to see if that resolves it.
> +				 * This is probably an AFU bug.

I would tend to agree with this statement. ioasa->resid should be zero in an overrun case.

>  				 */
>  				scp->result = (DID_ERROR << 16);
>  			}

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 16/30] cxlflash: Fix host link up event handling
  2015-09-16 21:30 ` [PATCH v2 16/30] cxlflash: Fix host link up event handling Matthew R. Ochs
@ 2015-09-21 21:47   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-21 21:47 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 17/30] cxlflash: Fix async interrupt bypass logic
  2015-09-16 21:30 ` [PATCH v2 17/30] cxlflash: Fix async interrupt bypass logic Matthew R. Ochs
@ 2015-09-21 21:48   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-21 21:48 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 18/30] cxlflash: Remove dual port online dependency
  2015-09-16 21:30 ` [PATCH v2 18/30] cxlflash: Remove dual port online dependency Matthew R. Ochs
@ 2015-09-21 22:02   ` Brian King
  2015-09-22 20:44     ` Matthew R. Ochs
  0 siblings, 1 reply; 79+ messages in thread
From: Brian King @ 2015-09-21 22:02 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
> At present, both ports must be online for the device to
> configure properly. Remove this dependency and the unnecessary
> internal LUN override logic as well. Additionally, as a refactoring
> measure, change the return code variable name to match that used
> throughout the driver.

Doesn't this also change the behavior so that init_afu no longer fails even
if BOTH ports fail to go offline in the reconfig case? Is that OK?

-Brian
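
To make the question concrete, here is a small userspace model of the
return-code flow in the patched afu_set_wwpn() hunk quoted below. The
function model_set_wwpn_rc() and its boolean inputs are hypothetical; they
replace the register waits with flags so the control flow can be read in
isolation. As modeled, every path returns 0, and the WWPN is written only
when the port actually went offline.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the patched flow. 'went_offline' and 'came_online'
 * stand in for the results of wait_port_offline()/wait_port_online().
 */
static int model_set_wwpn_rc(bool went_offline, bool came_online,
			     bool *wwpn_written)
{
	int rc = 0;

	*wwpn_written = false;

	if (!went_offline)
		rc = -1;	/* offline wait timed out */

	if (rc == 0)
		*wwpn_written = true;	/* models the writeq_be() of the WWPN */

	/* The patch resets rc here, discarding the offline timeout. */
	rc = 0;

	if (!came_online) {
		/* Only a debug message in the patched code; rc is untouched. */
	}

	return rc;
}
```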

> 
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/main.c | 23 ++++++++---------------
>  1 file changed, 8 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 74eb742..e2cc410 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -1031,7 +1031,7 @@ static int wait_port_offline(u64 *fc_regs, u32 delay_us, u32 nretry)
>   */
>  static int afu_set_wwpn(struct afu *afu, int port, u64 *fc_regs, u64 wwpn)
>  {
> -	int ret = 0;
> +	int rc = 0;
> 
>  	set_port_offline(fc_regs);
> 
> @@ -1039,33 +1039,26 @@ static int afu_set_wwpn(struct afu *afu, int port, u64 *fc_regs, u64 wwpn)
>  			       FC_PORT_STATUS_RETRY_CNT)) {
>  		pr_debug("%s: wait on port %d to go offline timed out\n",
>  			 __func__, port);
> -		ret = -1; /* but continue on to leave the port back online */
> +		rc = -1; /* but continue on to leave the port back online */
>  	}
> 
> -	if (ret == 0)
> +	if (rc == 0)
>  		writeq_be(wwpn, &fc_regs[FC_PNAME / 8]);
> 
> +	/* Always return success after programming WWPN */
> +	rc = 0;
> +
>  	set_port_online(fc_regs);
> 
>  	if (!wait_port_online(fc_regs, FC_PORT_STATUS_RETRY_INTERVAL_US,
>  			      FC_PORT_STATUS_RETRY_CNT)) {
>  		pr_debug("%s: wait on port %d to go online timed out\n",
>  			 __func__, port);
> -		ret = -1;
> -
> -		/*
> -		 * Override for internal lun!!!
> -		 */
> -		if (afu->internal_lun) {
> -			pr_debug("%s: Overriding port %d online timeout!!!\n",
> -				 __func__, port);
> -			ret = 0;
> -		}
>  	}
> 
> -	pr_debug("%s: returning rc=%d\n", __func__, ret);
> +	pr_debug("%s: returning rc=%d\n", __func__, rc);
> 
> -	return ret;
> +	return rc;
>  }
> 
>  /**
> 


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 03/30] cxlflash: Fix read capacity timeout
  2015-09-21 11:36   ` Tomas Henzl
@ 2015-09-21 22:11     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-21 22:11 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan, Michael Neuling,
	linuxppc-dev, Manoj Kumar, Manoj N. Kumar

> On Sep 21, 2015, at 6:36 AM, Tomas Henzl <thenzl@redhat.com> wrote:
> On 16.9.2015 23:26, Matthew R. Ochs wrote:
>> From: Manoj Kumar <kumarmn@us.ibm.com>
>>
>> The timeout value for read capacity is too small. Certain devices
>> may take longer to respond and thus the command may prematurely
>> timeout. Additionally the literal used for the timeout is stale.
>>
>> Update the timeout to 30 seconds (matches the value used in sd.c)
>> and rework the timeout literal to a more appropriate description.
>>
>> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
>> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
>> Suggested-by: Brian King <brking@linux.vnet.ibm.com>
>> ---
>> drivers/scsi/cxlflash/superpipe.c | 9 ++++-----
>> drivers/scsi/cxlflash/superpipe.h | 2 +-
>> drivers/scsi/cxlflash/vlun.c      | 4 ++--
>> 3 files changed, 7 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
>> index 7df985d..fa513ba 100644
>> --- a/drivers/scsi/cxlflash/superpipe.c
>> +++ b/drivers/scsi/cxlflash/superpipe.c
>> @@ -296,7 +296,7 @@ static int read_cap16(struct scsi_device *sdev, struct llun_info *lli)
>> 	int rc = 0;
>> 	int result = 0;
>> 	int retry_cnt = 0;
>> -	u32 tout = (MC_DISCOVERY_TIMEOUT * HZ);
>> +	u32 to = (CMD_TIMEOUT * HZ);
>
> In V3 please remove the parentheses here^

Sure.

>
>>
>> retry:
>> 	cmd_buf = kzalloc(CMD_BUFSIZE, GFP_KERNEL);
> ...
>
>> @@ -1376,8 +1375,8 @@ out_attach:
>> 	attach->block_size = gli->blk_len;
>> 	attach->mmio_size = sizeof(afu->afu_map->hosts[0].harea);
>> 	attach->last_lba = gli->max_lba;
>> -	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT) /
>> 		gli->blk_len;
>> +	attach->max_xfer = (sdev->host->max_sectors * MAX_SECTOR_UNIT);
>
> and here^ too.

done.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal
  2015-09-21 12:11   ` Tomas Henzl
@ 2015-09-21 22:32     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-21 22:32 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan, Michael Neuling,
	linuxppc-dev, Manoj N. Kumar

> On Sep 21, 2015, at 7:11 AM, Tomas Henzl <thenzl@redhat.com> wrote:
> On 16.9.2015 23:27, Matthew R. Ochs wrote:
>> When a LUN is removed, the sdev that is associated with the LUN
>> remains intact until its reference count drops to 0. In order
>> to prevent an sdev from being removed while a context is still
>> associated with it, obtain an additional reference per-context
>> for each LUN attached to the context.
>>
>> This resolves a potential Oops in the release handler when
>> dealing with a LUN that has already been removed.
>>
>> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
>> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
>> Suggested-by: Brian King <brking@linux.vnet.ibm.com>
>> ---
>> drivers/scsi/cxlflash/superpipe.c | 36 ++++++++++++++++++++++++------------
>> 1 file changed, 24 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
>> index fa513ba..1fa4af6 100644
>> --- a/drivers/scsi/cxlflash/superpipe.c
>> +++ b/drivers/scsi/cxlflash/superpipe.c
>> @@ -880,6 +880,9 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev,
>> 			sys_close(lfd);
>> 	}
>>
>> +	/* Release the sdev reference that bound this LUN to the context */
>> +	scsi_device_put(sdev);
>> +
>> +	scsi_device_put(sdev);
>> +
>
> I'm not sure about the use of scsi_device_get+put here; I also don't quite
> understand what you are trying to fix and how it can happen.
> scsi_device_get takes an additional module reference, so if used from
> a module it shouldn't be held for a long time.

The issue here is that the user context needs to be bound to the device so that
in the event that device goes away, it doesn't completely go away until the user
context is done using it. Without it, it is possible to crash when the context is
being freed.

Essentially this is the same as incrementing the count when an open is performed
on the device. The device can be removed (and is hidden upon doing so) but is
not actually freed until the reference is resolved (close()).
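
The lifetime rule described above can be modeled in userspace. This is only
a sketch: model_sdev, model_get(), model_put() and model_remove() are
hypothetical stand-ins, not the SCSI midlayer API, and the real
scsi_device_get()/scsi_device_put() do more (for example, module reference
handling). The model is single-threaded and shows only that removal hides
the object while the memory is released when the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a reference-counted device. */
struct model_sdev {
	int refcount;	/* one reference per holder */
	bool removed;	/* hidden from new users after removal */
	bool *freed;	/* set when the memory is actually released */
};

static struct model_sdev *model_alloc(bool *freed)
{
	struct model_sdev *sdev = calloc(1, sizeof(*sdev));

	if (!sdev)
		return NULL;
	sdev->refcount = 1;	/* initial reference owned by the "midlayer" */
	sdev->freed = freed;
	*freed = false;
	return sdev;
}

/* Take a reference; fails once the device has been removed. */
static bool model_get(struct model_sdev *sdev)
{
	if (sdev->removed)
		return false;
	sdev->refcount++;
	return true;
}

/* Drop a reference; the last put releases the memory. */
static void model_put(struct model_sdev *sdev)
{
	assert(sdev->refcount > 0);
	if (--sdev->refcount == 0) {
		*sdev->freed = true;
		free(sdev);
	}
}

/* Removal hides the device and drops only the initial reference. */
static void model_remove(struct model_sdev *sdev)
{
	sdev->removed = true;
	model_put(sdev);
}
```

With a context holding its own reference, model_remove() leaves the object
allocated; it is freed only by the context's matching model_put().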

> Is it possible for a user to rmmod the cxlflash module
> after the disk attach function is called?

Not while a user is present.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 11/30] cxlflash: Make functions static
  2015-09-21 12:18   ` Tomas Henzl
@ 2015-09-21 22:36     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-21 22:36 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan, Michael Neuling,
	linuxppc-dev, Manoj N. Kumar

> On Sep 21, 2015, at 7:18 AM, Tomas Henzl <thenzl@redhat.com> wrote:
> On 16.9.2015 23:28, Matthew R. Ochs wrote:
>> 
>> +
>> +write_rrin:
>> +	nretry = 0;
>> +	writeq_be(rrin, &afu->host_map->ioarrin);
>> +	do {
>> +		rrin = readq_be(&afu->host_map->ioarrin);
>> +		if (rrin != 0x1)
>> +			break;
>> +		/* Double delay each time */
>> +		udelay(2 ^ nretry);
> 
> Double delay - isn't another operator needed?
> If so, please add a new patch for this.
> --tm

Good catch. This was an oversight. Will fix in a new patch in v3.


-matt
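
For reference on the operator issue raised above: in C, `^` is bitwise
XOR, not exponentiation, so `2 ^ nretry` yields 2, 3, 0, 1, 6, ... rather
than a doubling sequence. A minimal userspace sketch of one way to get a
doubling delay is a left shift. The helper names and the MAX_SHIFT cap are
hypothetical and not taken from the driver; the fix that lands in v3 may
look different.

```c
#include <assert.h>

/* Hypothetical cap so the shift stays well-defined and the delay bounded. */
#define MAX_SHIFT 16u

/* What the quoted hunk computes: bitwise XOR, not a power of two. */
static unsigned int xor_delay_us(unsigned int nretry)
{
	return 2u ^ nretry;
}

/* Doubling delay: 1, 2, 4, 8, ... microseconds, capped at 1 << MAX_SHIFT. */
static unsigned int backoff_delay_us(unsigned int nretry)
{
	if (nretry > MAX_SHIFT)
		nretry = MAX_SHIFT;
	return 1u << nretry;
}
```

In the retry loop this would amount to passing `1 << nretry` to udelay()
and incrementing nretry on each pass, assuming the retry count is bounded.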

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 21/30] cxlflash: Fix to prevent workq from accessing freed memory
  2015-09-21 12:25   ` Tomas Henzl
@ 2015-09-21 22:44     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-21 22:44 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan, Michael Neuling,
	linuxppc-dev, Manoj N. Kumar

> On Sep 21, 2015, at 7:25 AM, Tomas Henzl <thenzl@redhat.com> wrote:
> On 16.9.2015 23:31, Matthew R. Ochs wrote:
>> The workq can process work in parallel with a remove event, leading
>> to a condition where the workq handler can access freed memory.
>> 
>> To remedy, the workq should be terminated prior to freeing memory. Move
>> the termination call earlier in remove and use cancel_work_sync() instead
>> of flush_work() as there is not a need to process any scheduled work when
>> shutting down.
>> 
>> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
>> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
>> ---
>> drivers/scsi/cxlflash/main.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
>> index 1856a73..1625aea 100644
>> --- a/drivers/scsi/cxlflash/main.c
>> +++ b/drivers/scsi/cxlflash/main.c
>> @@ -736,12 +736,12 @@ static void cxlflash_remove(struct pci_dev *pdev)
>> 		scsi_remove_host(cfg->host);
>> 		/* Fall through */
>> 	case INIT_STATE_AFU:
>> +		cancel_work_sync(&cfg->work_q);
>> 		term_afu(cfg);
> 
> You disable irqs after a call to cancel_work_sync.
> That means a late int could trigger the workqueue again?
> Please disable irqs earlier - as described in Documentation/PCI/pci.txt

I'll change the order here such that the work is cancelled after
term_afu() is called.

-matt
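
The ordering concern above can be illustrated with a toy userspace model.
It is a sketch under the assumption, taken from the review comment, that
interrupts are quiesced inside term_afu(); the struct and function names
below are hypothetical and do not model the real workqueue API. It shows
why cancelling work before interrupts are off can leave work queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant adapter state. */
struct dev_model {
	bool irq_enabled;   /* interrupts can still fire */
	bool work_pending;  /* a work item is queued */
};

/* A late interrupt queues work only while interrupts are enabled. */
static void late_interrupt(struct dev_model *d)
{
	if (d->irq_enabled)
		d->work_pending = true;
}

static void disable_irqs(struct dev_model *d) { d->irq_enabled = false; }
static void cancel_work(struct dev_model *d) { d->work_pending = false; }

/* Order in the v2 hunk: cancel first, then quiesce interrupts. */
static bool teardown_cancel_first(struct dev_model *d)
{
	cancel_work(d);
	late_interrupt(d);   /* slips in before interrupts are off */
	disable_irqs(d);
	return d->work_pending;
}

/* Order agreed in the reply: quiesce interrupts, then cancel. */
static bool teardown_irqs_first(struct dev_model *d)
{
	disable_irqs(d);
	late_interrupt(d);   /* no effect: interrupts are already off */
	cancel_work(d);
	return d->work_pending;
}
```

Only the second ordering guarantees that nothing is left pending when the
memory is freed afterwards.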

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 29/30] cxlflash: Fix to avoid state change collision
  2015-09-21 12:44   ` Tomas Henzl
@ 2015-09-21 22:59     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-21 22:59 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Brian King,
	Ian Munsie, Daniel Axtens, Andrew Donnellan, Michael Neuling,
	linuxppc-dev, Manoj N. Kumar

> On Sep 21, 2015, at 7:44 AM, Tomas Henzl <thenzl@redhat.com> wrote:
> On 16.9.2015 23:32, Matthew R. Ochs wrote:
>> The adapter state machine is susceptible to missing and/or
>> corrupting state updates at runtime. This can lead to a variety
>> of unintended issues and is due to the lack of a serialization
>> mechanism to protect the adapter state.
>> 
>> Use an adapter-wide mutex to serialize state changes.
> 
> I've just briefly looked into your code, but it seems to me that
> an atomic variable would serve your needs also and might be 
> more effective resulting in a faster code execution?

Will keep this in mind. 

> If you keep the mutex way you don't need two mutexes
> in cxlflash_afu_sync - you should remove the mutex &sync_active

Agreed. Will remove as part of this patch in v3.
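
The atomic-variable alternative mentioned above could look roughly like
the following userspace C11 sketch. It only illustrates a compare-and-swap
state transition; the enum values and the helper try_transition() are
hypothetical and not the driver's. In the kernel one would use atomic_t
and atomic_cmpxchg() rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical adapter states, named for illustration only. */
enum adapter_state { ST_NORMAL, ST_RESET, ST_FAILTERM };

static _Atomic int adapter_state = ST_NORMAL;

/*
 * Move from 'from' to 'to' only if the state is still 'from'.
 * Returns true when this caller performed the transition.
 */
static bool try_transition(enum adapter_state from, enum adapter_state to)
{
	int expected = from;

	return atomic_compare_exchange_strong(&adapter_state, &expected, to);
}
```

A single compare-and-swap protects only the state word itself. If several
steps must run while the state is held stable, a mutex as used in the
patch may still be the simpler choice.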

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 14/30] cxlflash: Fix to avoid stall while waiting on TMF
  2015-09-21 18:24   ` Brian King
@ 2015-09-21 23:05     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-21 23:05 UTC (permalink / raw)
  To: Brian King
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Ian Munsie,
	Daniel Axtens, Andrew Donnellan, Michael Neuling, linuxppc-dev,
	Manoj N. Kumar

> On Sep 21, 2015, at 1:24 PM, Brian King <brking@linux.vnet.ibm.com> wrote:
> On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
>> Borrowing the TMF waitq's spinlock causes a stall condition when
>> waiting for the TMF to complete. To remedy, introduce our own spin
>> lock to serialize TMF and use the appropriate wait services.
> 
> Can you clarify what stall condition you were seeing? It's not obvious
> to me what this fixes. Do you have soft lockup logs from the failure?

I believe we saw cascading RCU stalls.

I couldn't find any more details in my notes or development commits.
Unfortunately the logs are long gone as this was fixed in June.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 18/30] cxlflash: Remove dual port online dependency
  2015-09-21 22:02   ` Brian King
@ 2015-09-22 20:44     ` Matthew R. Ochs
  2015-09-22 20:50       ` Brian King
  0 siblings, 1 reply; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-22 20:44 UTC (permalink / raw)
  To: Brian King
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Ian Munsie,
	Daniel Axtens, Andrew Donnellan, Michael Neuling, linuxppc-dev,
	Manoj N. Kumar

> On Sep 21, 2015, at 5:02 PM, Brian King <brking@linux.vnet.ibm.com> wrote:
> On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
>> At present, both ports must be online for the device to
>> configure properly. Remove this dependency and the unnecessary
>> internal LUN override logic as well. Additionally, as a refactoring
>> measure, change the return code variable name to match that used
>> throughout the driver.
> 
> Doesn't this also change the behavior to no longer fail init_afu even
> if BOTH ports fail to go offline in the reconfig case. Is that OK?

Correct, there is a change in behavior but it is not an issue.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 19/30] cxlflash: Fix AFU version access/storage and add check
  2015-09-16 21:30 ` [PATCH v2 19/30] cxlflash: Fix AFU version access/storage and add check Matthew R. Ochs
@ 2015-09-22 20:47   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-22 20:47 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 18/30] cxlflash: Remove dual port online dependency
  2015-09-22 20:44     ` Matthew R. Ochs
@ 2015-09-22 20:50       ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-22 20:50 UTC (permalink / raw)
  To: Matthew R. Ochs
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Ian Munsie,
	Daniel Axtens, Andrew Donnellan, Michael Neuling, linuxppc-dev,
	Manoj N. Kumar

On 09/22/2015 03:44 PM, Matthew R. Ochs wrote:
>> On Sep 21, 2015, at 5:02 PM, Brian King <brking@linux.vnet.ibm.com> wrote:
>> On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
>>> At present, both ports must be online for the device to
>>> configure properly. Remove this dependency and the unnecessary
>>> internal LUN override logic as well. Additionally, as a refactoring
>>> measure, change the return code variable name to match that used
>>> throughout the driver.
>>
>> Doesn't this also change the behavior to no longer fail init_afu even
>> if BOTH ports fail to go offline in the reconfig case. Is that OK?
> 
> Correct, there is a change in behavior but it is not an issue.
> 

ok.

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 20/30] cxlflash: Correct usage of scsi_host_put()
  2015-09-16 21:30 ` [PATCH v2 20/30] cxlflash: Correct usage of scsi_host_put() Matthew R. Ochs
@ 2015-09-22 20:53   ` Brian King
  2015-09-22 21:49     ` Matthew R. Ochs
  0 siblings, 1 reply; 79+ messages in thread
From: Brian King @ 2015-09-22 20:53 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
> Currently, scsi_host_put() is being called prematurely in the
> remove path and is missing entirely in an error cleanup path.
> The former can lead to memory being freed too early with
> subsequent access potentially corrupting data whilst the latter
> would result in a memory leak.
> 
> Move the usage on remove to be the last cleanup action taken
> and introduce a call to scsi_host_put() in the one initialization
> error path that does not use remove to cleanup.
> 
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index fc77cd4..1856a73 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -734,7 +734,6 @@ static void cxlflash_remove(struct pci_dev *pdev)
>  	case INIT_STATE_SCSI:
>  		cxlflash_term_local_luns(cfg);
>  		scsi_remove_host(cfg->host);
> -		scsi_host_put(cfg->host);
>  		/* Fall through */
>  	case INIT_STATE_AFU:
>  		term_afu(cfg);
> @@ -744,6 +743,7 @@ static void cxlflash_remove(struct pci_dev *pdev)
>  	case INIT_STATE_NONE:
>  		flush_work(&cfg->work_q);
>  		free_mem(cfg);
> +		scsi_host_put(cfg->host);
>  		break;
>  	}
> 
> @@ -2415,6 +2415,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
>  		dev_err(&pdev->dev, "%s: call to scsi_host_alloc failed!\n",

This message text is wrong. It's the call to alloc_mem that has failed in this
leg, not the call to scsi_host_alloc.

>  			__func__);
>  		rc = -ENOMEM;
> +		scsi_host_put(cfg->host);
>  		goto out;
>  	}
> 


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 22/30] cxlflash: Correct behavior in device reset handler following EEH
  2015-09-16 21:31 ` [PATCH v2 22/30] cxlflash: Correct behavior in device reset handler following EEH Matthew R. Ochs
@ 2015-09-22 20:58   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-22 20:58 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 23/30] cxlflash: Remove unnecessary scsi_block_requests
  2015-09-16 21:31 ` [PATCH v2 23/30] cxlflash: Remove unnecessary scsi_block_requests Matthew R. Ochs
@ 2015-09-22 20:59   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-22 20:59 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 24/30] cxlflash: Fix function prolog parameters and return codes
  2015-09-16 21:31 ` [PATCH v2 24/30] cxlflash: Fix function prolog parameters and return codes Matthew R. Ochs
@ 2015-09-22 21:02   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-22 21:02 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 20/30] cxlflash: Correct usage of scsi_host_put()
  2015-09-22 20:53   ` Brian King
@ 2015-09-22 21:49     ` Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-22 21:49 UTC (permalink / raw)
  To: Brian King
  Cc: linux-scsi, James Bottomley, Nicholas A. Bellinger, Ian Munsie,
	Daniel Axtens, Andrew Donnellan, Michael Neuling, linuxppc-dev,
	Manoj N. Kumar

> On Sep 22, 2015, at 3:53 PM, Brian King <brking@linux.vnet.ibm.com> wrote:
> On 09/16/2015 04:30 PM, Matthew R. Ochs wrote:
>> Currently, scsi_host_put() is being called prematurely in the
>> remove path and is missing entirely in an error cleanup path.
>> The former can lead to memory being freed too early with
>> subsequent access potentially corrupting data whilst the latter
>> would result in a memory leak.
>>
>> Move the usage on remove to be the last cleanup action taken
>> and introduce a call to scsi_host_put() in the one initialization
>> error path that does not use remove to cleanup.
>>
>> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
>> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
>> ---
>> drivers/scsi/cxlflash/main.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
>> index fc77cd4..1856a73 100644
>> --- a/drivers/scsi/cxlflash/main.c
>> +++ b/drivers/scsi/cxlflash/main.c
>> @@ -734,7 +734,6 @@ static void cxlflash_remove(struct pci_dev *pdev)
>> 	case INIT_STATE_SCSI:
>> 		cxlflash_term_local_luns(cfg);
>> 		scsi_remove_host(cfg->host);
>> -		scsi_host_put(cfg->host);
>> 		/* Fall through */
>> 	case INIT_STATE_AFU:
>> 		term_afu(cfg);
>> @@ -744,6 +743,7 @@ static void cxlflash_remove(struct pci_dev *pdev)
>> 	case INIT_STATE_NONE:
>> 		flush_work(&cfg->work_q);
>> 		free_mem(cfg);
>> +		scsi_host_put(cfg->host);
>> 		break;
>> 	}
>>
>> @@ -2415,6 +2415,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
>> 		dev_err(&pdev->dev, "%s: call to scsi_host_alloc failed!\n",
>
> This message text is wrong. It's the call to alloc_mem that has failed in this
> leg, not the call to scsi_host_alloc.

Good find. I'll fix this in a separate patch.


-matt

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 25/30] cxlflash: Fix MMIO and endianness errors
  2015-09-16 21:32 ` [PATCH v2 25/30] cxlflash: Fix MMIO and endianness errors Matthew R. Ochs
@ 2015-09-23 15:03   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-23 15:03 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure
  2015-09-16 21:32 ` [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure Matthew R. Ochs
@ 2015-09-23 19:09   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-23 19:09 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 27/30] cxlflash: Correct spelling, grammar, and alignment mistakes
  2015-09-16 21:32 ` [PATCH v2 27/30] cxlflash: Correct spelling, grammar, and alignment mistakes Matthew R. Ochs
@ 2015-09-23 19:13   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-23 19:13 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ
  2015-09-16 21:32 ` [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ Matthew R. Ochs
@ 2015-09-23 19:18   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-23 19:18 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH v2 30/30] MAINTAINERS: Add cxlflash driver
  2015-09-16 21:33 ` [PATCH v2 30/30] MAINTAINERS: Add cxlflash driver Matthew R. Ochs
@ 2015-09-23 19:19   ` Brian King
  0 siblings, 0 replies; 79+ messages in thread
From: Brian King @ 2015-09-23 19:19 UTC (permalink / raw)
  To: Matthew R. Ochs, linux-scsi, James Bottomley,
	Nicholas A. Bellinger, Ian Munsie, Daniel Axtens,
	Andrew Donnellan
  Cc: Michael Neuling, linuxppc-dev, Manoj N. Kumar

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ
@ 2015-09-16 17:05 Matthew R. Ochs
  0 siblings, 0 replies; 79+ messages in thread
From: Matthew R. Ochs @ 2015-09-16 17:05 UTC (permalink / raw)
  To: linux-scsi, James.Bottomley, nab, brking, imunsie, dja, andrew.donnellan
  Cc: mikey, linuxppc-dev, Manoj N. Kumar

Following an adapter reset, the AFU RRQ that resides in host memory
holds stale data. This can lead to a condition where the RRQ interrupt
handler tries to process stale entries and/or endlessly loops due to an
out of sync generation bit.

To fix, the AFU RRQ in host memory needs to be cleared after each reset.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index a5b45ed..0487fac 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1609,6 +1609,9 @@ static int start_afu(struct cxlflash_cfg *cfg)
 
 	init_pcr(cfg);
 
+	/* After an AFU reset, RRQ entries are stale, clear them */
+	memset(&afu->rrq_entry, 0, sizeof(afu->rrq_entry));
+
 	/* Initialize RRQ pointers */
 	afu->hrrq_start = &afu->rrq_entry[0];
 	afu->hrrq_end = &afu->rrq_entry[NUM_RRQ_ENTRY - 1];
-- 
2.1.0
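
To illustrate why stale entries matter after a reset, here is a toy
userspace model of a response ring with a generation (toggle) bit. It is a
sketch under assumptions: the ring size, the bit position, and the names
rrq_model, rrq_reset() and rrq_drain() are hypothetical and do not mirror
the driver's structures. After a reset the host starts again at slot 0
expecting the initial generation, so leftover entries from the previous
first pass look valid unless the ring is cleared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_ENTRIES 4       /* hypothetical ring size */
#define TOGGLE_BIT  0x1ull  /* hypothetical generation bit */

struct rrq_model {
	uint64_t entry[NUM_ENTRIES];
	unsigned int curr;   /* next slot the host will examine */
	uint64_t toggle;     /* generation value the host expects */
};

/* Reset host-side bookkeeping; optionally clear the ring as the patch does. */
static void rrq_reset(struct rrq_model *q, int clear)
{
	if (clear)
		memset(q->entry, 0, sizeof(q->entry));
	q->curr = 0;
	q->toggle = TOGGLE_BIT;
}

/* Consume entries whose generation bit matches; return how many were seen. */
static unsigned int rrq_drain(struct rrq_model *q)
{
	unsigned int n = 0;

	while ((q->entry[q->curr] & TOGGLE_BIT) == q->toggle) {
		n++;
		if (++q->curr == NUM_ENTRIES) {
			q->curr = 0;
			q->toggle ^= TOGGLE_BIT;
		}
	}
	return n;
}
```

In this model, resetting without clearing makes rrq_drain() report the old
entries a second time, while resetting with the memset() reports none.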

^ permalink raw reply related	[flat|nested] 79+ messages in thread

end of thread, other threads:[~2015-09-23 20:46 UTC | newest]

Thread overview: 79+ messages
2015-09-16 21:23 [PATCH v2 00/30] cxlflash: Miscellaneous bug fixes and corrections Matthew R. Ochs
2015-09-16 21:25 ` [PATCH v2 01/30] cxlflash: Fix to avoid invalid port_sel value Matthew R. Ochs
2015-09-18  1:16   ` Brian King
2015-09-16 21:26 ` [PATCH v2 02/30] cxlflash: Replace magic numbers with literals Matthew R. Ochs
2015-09-18  1:18   ` Brian King
2015-09-16 21:26 ` [PATCH v2 03/30] cxlflash: Fix read capacity timeout Matthew R. Ochs
2015-09-18  1:21   ` Brian King
2015-09-21 11:36   ` Tomas Henzl
2015-09-21 22:11     ` Matthew R. Ochs
2015-09-16 21:27 ` [PATCH v2 04/30] cxlflash: Fix potential oops following LUN removal Matthew R. Ochs
2015-09-18  1:26   ` Brian King
2015-09-18 23:18     ` Matthew R. Ochs
2015-09-21 12:11   ` Tomas Henzl
2015-09-21 22:32     ` Matthew R. Ochs
2015-09-16 21:27 ` [PATCH v2 05/30] cxlflash: Fix data corruption when vLUN used over multiple cards Matthew R. Ochs
2015-09-18  1:28   ` Brian King
2015-09-16 21:27 ` [PATCH v2 06/30] cxlflash: Fix to avoid sizeof(bool) Matthew R. Ochs
2015-09-18  1:29   ` Brian King
2015-09-16 21:27 ` [PATCH v2 07/30] cxlflash: Fix context encode mask width Matthew R. Ochs
2015-09-18  1:29   ` Brian King
2015-09-16 21:27 ` [PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH Matthew R. Ochs
2015-09-18 13:37   ` Brian King
2015-09-18 23:54     ` Matthew R. Ochs
2015-09-16 21:28 ` [PATCH v2 09/30] cxlflash: Fix to stop interrupt processing on remove Matthew R. Ochs
2015-09-17 11:58   ` David Laight
2015-09-17 16:55     ` Matthew R. Ochs
2015-09-16 21:28 ` [PATCH v2 10/30] cxlflash: Correct naming of limbo state and waitq Matthew R. Ochs
2015-09-18 15:28   ` Brian King
2015-09-16 21:28 ` [PATCH v2 11/30] cxlflash: Make functions static Matthew R. Ochs
2015-09-18 15:34   ` Brian King
2015-09-21 12:18   ` Tomas Henzl
2015-09-21 22:36     ` Matthew R. Ochs
2015-09-16 21:29 ` [PATCH v2 12/30] cxlflash: Refine host/device attributes Matthew R. Ochs
2015-09-18 21:34   ` Brian King
2015-09-18 23:56     ` Matthew R. Ochs
2015-09-21  9:55     ` David Laight
2015-09-16 21:30 ` [PATCH v2 13/30] cxlflash: Fix to avoid spamming the kernel log Matthew R. Ochs
2015-09-18 21:39   ` Brian King
2015-09-16 21:30 ` [PATCH v2 14/30] cxlflash: Fix to avoid stall while waiting on TMF Matthew R. Ochs
2015-09-21 18:24   ` Brian King
2015-09-21 23:05     ` Matthew R. Ochs
2015-09-16 21:30 ` [PATCH v2 15/30] cxlflash: Fix location of setting resid Matthew R. Ochs
2015-09-21 18:28   ` Brian King
2015-09-16 21:30 ` [PATCH v2 16/30] cxlflash: Fix host link up event handling Matthew R. Ochs
2015-09-21 21:47   ` Brian King
2015-09-16 21:30 ` [PATCH v2 17/30] cxlflash: Fix async interrupt bypass logic Matthew R. Ochs
2015-09-21 21:48   ` Brian King
2015-09-16 21:30 ` [PATCH v2 18/30] cxlflash: Remove dual port online dependency Matthew R. Ochs
2015-09-21 22:02   ` Brian King
2015-09-22 20:44     ` Matthew R. Ochs
2015-09-22 20:50       ` Brian King
2015-09-16 21:30 ` [PATCH v2 19/30] cxlflash: Fix AFU version access/storage and add check Matthew R. Ochs
2015-09-22 20:47   ` Brian King
2015-09-16 21:30 ` [PATCH v2 20/30] cxlflash: Correct usage of scsi_host_put() Matthew R. Ochs
2015-09-22 20:53   ` Brian King
2015-09-22 21:49     ` Matthew R. Ochs
2015-09-16 21:31 ` [PATCH v2 21/30] cxlflash: Fix to prevent workq from accessing freed memory Matthew R. Ochs
2015-09-21 12:25   ` Tomas Henzl
2015-09-21 22:44     ` Matthew R. Ochs
2015-09-16 21:31 ` [PATCH v2 22/30] cxlflash: Correct behavior in device reset handler following EEH Matthew R. Ochs
2015-09-22 20:58   ` Brian King
2015-09-16 21:31 ` [PATCH v2 23/30] cxlflash: Remove unnecessary scsi_block_requests Matthew R. Ochs
2015-09-22 20:59   ` Brian King
2015-09-16 21:31 ` [PATCH v2 24/30] cxlflash: Fix function prolog parameters and return codes Matthew R. Ochs
2015-09-22 21:02   ` Brian King
2015-09-16 21:32 ` [PATCH v2 25/30] cxlflash: Fix MMIO and endianness errors Matthew R. Ochs
2015-09-23 15:03   ` Brian King
2015-09-16 21:32 ` [PATCH v2 26/30] cxlflash: Fix to prevent EEH recovery failure Matthew R. Ochs
2015-09-23 19:09   ` Brian King
2015-09-16 21:32 ` [PATCH v2 27/30] cxlflash: Correct spelling, grammar, and alignment mistakes Matthew R. Ochs
2015-09-23 19:13   ` Brian King
2015-09-16 21:32 ` [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ Matthew R. Ochs
2015-09-23 19:18   ` Brian King
2015-09-16 21:32 ` [PATCH v2 29/30] cxlflash: Fix to avoid state change collision Matthew R. Ochs
2015-09-21 12:44   ` Tomas Henzl
2015-09-21 22:59     ` Matthew R. Ochs
2015-09-16 21:33 ` [PATCH v2 30/30] MAINTAINERS: Add cxlflash driver Matthew R. Ochs
2015-09-23 19:19   ` Brian King
  -- strict thread matches above, loose matches on Subject: below --
2015-09-16 17:05 [PATCH v2 28/30] cxlflash: Fix to prevent stale AFU RRQ Matthew R. Ochs
