linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging
@ 2016-11-15 23:12 Uma Krishnan
  2016-11-15 23:13 ` [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE Uma Krishnan
                   ` (13 more replies)
  0 siblings, 14 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:12 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

The first four patches in this patch series include fixes for command
room violation and lun table management.

The remaining patches remove the reliance upon an internally maintained
private command pool in favor of private commands being allocated
alongside the SCSI commands. Several cleanup opportunities were noticed
while removing the private command pool infrastructure and have been
included as well. Lastly, the final two patches provide staging for
supporting hardware with a different queuing model.

The series is based upon 4.9-rc5, intended for 4.10 and is bisectable.

Matthew R. Ochs (10):
  cxlflash: Remove unused buffer from AFU command
  cxlflash: Allocate memory instead of using command pool for AFU sync
  cxlflash: Use cmd_size for private commands
  cxlflash: Remove private command pool
  cxlflash: Wait for active AFU commands to timeout upon tear down
  cxlflash: Remove AFU command lock
  cxlflash: Cleanup send_tmf()
  cxlflash: Cleanup queuecommand()
  cxlflash: Migrate IOARRIN specific routines to function pointers
  cxlflash: Migrate scsi command pointer to AFU command

Uma Krishnan (4):
  cxlflash: Set sg_tablesize to 1 instead of SG_NONE
  cxlflash: Fix crash in cxlflash_restore_luntable()
  cxlflash: Improve context_reset() logic
  cxlflash: Avoid command room violation

 drivers/scsi/cxlflash/common.h  |  35 ++--
 drivers/scsi/cxlflash/lunmgt.c  |   6 +
 drivers/scsi/cxlflash/main.c    | 360 +++++++++++-----------------------------
 drivers/scsi/cxlflash/sislite.h |   2 +-
 4 files changed, 120 insertions(+), 283 deletions(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
@ 2016-11-15 23:13 ` Uma Krishnan
  2016-11-17 19:20   ` Matthew R. Ochs
  2016-11-15 23:14 ` [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable() Uma Krishnan
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:13 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

The following Oops is encountered when blk_mq is enabled with the
cxlflash driver:

[ 2960.817172] Oops: Kernel access of bad area, sig: 11 [#5]
[ 2960.817309] NIP  __blk_mq_run_hw_queue+0x278/0x4c0
[ 2960.817313] LR __blk_mq_run_hw_queue+0x2bc/0x4c0
[ 2960.817314] Call Trace:
[ 2960.817320] __blk_mq_run_hw_queue+0x2bc/0x4c0 (unreliable)
[ 2960.817324] blk_mq_run_hw_queue+0xd8/0x100
[ 2960.817329] blk_mq_insert_requests+0x14c/0x1f0
[ 2960.817333] blk_mq_flush_plug_list+0x150/0x190
[ 2960.817338] blk_flush_plug_list+0x11c/0x2b0
[ 2960.817344] blk_finish_plug+0x58/0x80
[ 2960.817348] __do_page_cache_readahead+0x1c0/0x2e0
[ 2960.817352] force_page_cache_readahead+0x68/0xd0
[ 2960.817356] generic_file_read_iter+0x43c/0x6a0
[ 2960.817359] blkdev_read_iter+0x68/0xa0
[ 2960.817361] __vfs_read+0x11c/0x180
[ 2960.817364] vfs_read+0xa4/0x1c0
[ 2960.817366] SyS_read+0x6c/0x110
[ 2960.817369] system_call+0x38/0xb4

The SCSI blk_mq stack assumes that sg_tablesize is always a non-zero
value with scsi_mq_setup_tags() allocating tags using sg_tablesize.
The cxlflash driver currently uses SG_NONE (0) for the sg_tablesize
as the devices it supports are not capable of scatter gather. This
mismatch of values results in the Oops above.

To resolve this issue, sg_tablesize for cxlflash can simply be set
to 1, a value which satisfies the constraints in cxlflash and the
lack of support of SG_NONE in SCSI blk_mq.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index b301655..6004860 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -2377,7 +2377,7 @@ static struct scsi_host_template driver_template = {
 	.cmd_per_lun = CXLFLASH_MAX_CMDS_PER_LUN,
 	.can_queue = CXLFLASH_MAX_CMDS,
 	.this_id = -1,
-	.sg_tablesize = SG_NONE,	/* No scatter gather support */
+	.sg_tablesize = 1,	/* No scatter gather support */
 	.max_sectors = CXLFLASH_MAX_SECTORS,
 	.use_clustering = ENABLE_CLUSTERING,
 	.shost_attrs = cxlflash_host_attrs,
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable()
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
  2016-11-15 23:13 ` [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE Uma Krishnan
@ 2016-11-15 23:14 ` Uma Krishnan
  2016-11-17 19:20   ` Matthew R. Ochs
  2016-11-15 23:14 ` [PATCH 03/14] cxlflash: Improve context_reset() logic Uma Krishnan
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:14 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

During test, the following crash was observed:

[34538.981505] Faulting instruction address: 0xd000000007c9c870
cpu 0x9: Vector: 300 (Data Access) at [c0000007f1e8f590]
    pc: d000000007c9c870: cxlflash_restore_luntable+0x70/0x1d0 [cxlflash]
    lr: d000000007c9c84c: cxlflash_restore_luntable+0x4c/0x1d0 [cxlflash]
    sp: c0000007f1e8f810
   msr: 9000000100009033
   dar: c00000171d637438
 dsisr: 40000000
  current = 0xc0000007f1e43f90
  paca    = 0xc000000007b25100   softe: 0        irq_happened: 0x01
    pid   = 493, comm = eehd
enter ? for help
[c0000007f1e8f8a0] d000000007c940b0 init_afu+0xd60/0x1200 [cxlflash]
[c0000007f1e8f9a0] d000000007c945a8 cxlflash_pci_slot_reset+0x58/0xe0 [cxlflash]
[c0000007f1e8fa20] d00000000715f790 cxl_pci_slot_reset+0x230/0x340 [cxl]
[c0000007f1e8fae0] c000000000040dd4 eeh_report_reset+0x144/0x180
[c0000007f1e8fb20] c00000000003f708 eeh_pe_dev_traverse+0x98/0x170
[c0000007f1e8fbb0] c000000000041618 eeh_handle_normal_event+0x328/0x410
[c0000007f1e8fc30] c000000000041db8 eeh_handle_event+0x178/0x330
[c0000007f1e8fce0] c000000000042118 eeh_event_handler+0x1a8/0x1b0
[c0000007f1e8fd80] c00000000011420c kthread+0xec/0x100
[c0000007f1e8fe30] c00000000000a47c ret_from_kernel_thread+0x5c/0xe0

When superpipe mode is disabled for a LUN, the references for the
local lun are deleted but the LUN is still identified as being present
in the LUN table. This mismatched state can result in the above crash
when the LUN table is restored during an error recovery operation.

To fix this issue, the local LUN information structure is updated to
reflect the LUN is no longer in the LUN table once all references to
the LUN are gone.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/lunmgt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/cxlflash/lunmgt.c b/drivers/scsi/cxlflash/lunmgt.c
index a0923ca..6c318db9 100644
--- a/drivers/scsi/cxlflash/lunmgt.c
+++ b/drivers/scsi/cxlflash/lunmgt.c
@@ -254,8 +254,14 @@ int cxlflash_manage_lun(struct scsi_device *sdev,
 		if (lli->parent->mode != MODE_NONE)
 			rc = -EBUSY;
 		else {
+			/*
+			 * Clean up local LUN for this port and reset table
+			 * tracking when no more references exist.
+			 */
 			sdev->hostdata = NULL;
 			lli->port_sel &= ~CHAN2PORT(chan);
+			if (lli->port_sel == 0U)
+				lli->in_table = false;
 		}
 	}
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 03/14] cxlflash: Improve context_reset() logic
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
  2016-11-15 23:13 ` [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE Uma Krishnan
  2016-11-15 23:14 ` [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable() Uma Krishnan
@ 2016-11-15 23:14 ` Uma Krishnan
  2016-11-17 19:21   ` Matthew R. Ochs
  2016-11-15 23:14 ` [PATCH 04/14] cxlflash: Avoid command room violation Uma Krishnan
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:14 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

Currently, the context reset routine waits for command room to
be available before sending the reset request. Per review of the
SISLite specification and clarifications from the CXL Flash AFU
designers, this wait is unnecessary. The reset request can be
sent anytime regardless of command room, so long as only a single
reset request is active at any one point in time.

This commit simplifies the reset routine by removing the wait for
command room. Additionally it adds a debug trace to help pinpoint
hardware errors when a context reset does not complete.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 23 +++++------------------
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 6004860..6d33d8c 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -263,8 +263,9 @@ static void context_reset(struct afu_cmd *cmd)
 {
 	int nretry = 0;
 	u64 rrin = 0x1;
-	u64 room = 0;
 	struct afu *afu = cmd->parent;
+	struct cxlflash_cfg *cfg = afu->parent;
+	struct device *dev = &cfg->dev->dev;
 	ulong lock_flags;
 
 	pr_debug("%s: cmd=%p\n", __func__, cmd);
@@ -280,23 +281,6 @@ static void context_reset(struct afu_cmd *cmd)
 	cmd->sa.host_use_b[0] |= (B_DONE | B_ERROR | B_TIMEOUT);
 	spin_unlock_irqrestore(&cmd->slock, lock_flags);
 
-	/*
-	 * We really want to send this reset at all costs, so spread
-	 * out wait time on successive retries for available room.
-	 */
-	do {
-		room = readq_be(&afu->host_map->cmd_room);
-		atomic64_set(&afu->room, room);
-		if (room)
-			goto write_rrin;
-		udelay(1 << nretry);
-	} while (nretry++ < MC_ROOM_RETRY_CNT);
-
-	pr_err("%s: no cmd_room to send reset\n", __func__);
-	return;
-
-write_rrin:
-	nretry = 0;
 	writeq_be(rrin, &afu->host_map->ioarrin);
 	do {
 		rrin = readq_be(&afu->host_map->ioarrin);
@@ -305,6 +289,9 @@ static void context_reset(struct afu_cmd *cmd)
 		/* Double delay each time */
 		udelay(1 << nretry);
 	} while (nretry++ < MC_ROOM_RETRY_CNT);
+
+	dev_dbg(dev, "%s: returning rrin=0x%016llX nretry=%d\n",
+		__func__, rrin, nretry);
 }
 
 /**
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 04/14] cxlflash: Avoid command room violation
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (2 preceding siblings ...)
  2016-11-15 23:14 ` [PATCH 03/14] cxlflash: Improve context_reset() logic Uma Krishnan
@ 2016-11-15 23:14 ` Uma Krishnan
  2016-11-17 19:36   ` Matthew R. Ochs
  2016-11-15 23:14 ` [PATCH 05/14] cxlflash: Remove unused buffer from AFU command Uma Krishnan
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:14 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

During test, a command room violation interrupt is occasionally seen
for the master context when the CXL flash devices are stressed.

After studying the code, there could be gaps in the way command room
value is being cached in cxlflash. When the cached command room is zero
the thread attempting to send becomes burdened with updating the cached
value with the actual value from the AFU. Today, this is handled with
an atomic set operation of the raw value read. Following the atomic
update, the thread proceeds to send.

This behavior is incorrect on two counts:

   - The update fails to take into account the current thread and its
     consumption of one of the hardware commands.

   - The update does not take into account other threads also atomically
     updating. Per design, a worker thread updates the cached value when
     a send thread times out. By not performing an atomic compare/exchange,
     the cached value can be incorrectly clobbered.

To correct these issues, the runtime updates of the cached command room
are updated to use atomic64_cmpxchg() and the send routine is updated to
take into account the current thread consuming a hardware command.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 6d33d8c..1a32e8b 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -322,9 +322,10 @@ static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
 	if (!newval) {
 		do {
 			room = readq_be(&afu->host_map->cmd_room);
-			atomic64_set(&afu->room, room);
-			if (room)
-				goto write_ioarrin;
+			if (room) {
+				atomic64_cmpxchg(&afu->room, 0, room);
+				goto retry;
+			}
 			udelay(1 << nretry);
 		} while (nretry++ < MC_ROOM_RETRY_CNT);
 
@@ -346,7 +347,6 @@ static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
 		goto no_room;
 	}
 
-write_ioarrin:
 	writeq_be((u64)&cmd->rcb, &afu->host_map->ioarrin);
 out:
 	pr_devel("%s: cmd=%p len=%d ea=%p rc=%d\n", __func__, cmd,
@@ -2409,6 +2409,7 @@ static void cxlflash_worker_thread(struct work_struct *work)
 	struct afu *afu = cfg->afu;
 	struct device *dev = &cfg->dev->dev;
 	int port;
+	u64 room;
 	ulong lock_flags;
 
 	/* Avoid MMIO if the device has failed */
@@ -2437,8 +2438,11 @@ static void cxlflash_worker_thread(struct work_struct *work)
 	}
 
 	if (afu->read_room) {
-		atomic64_set(&afu->room, readq_be(&afu->host_map->cmd_room));
-		afu->read_room = false;
+		room = readq_be(&afu->host_map->cmd_room);
+		if (room) {
+			atomic64_cmpxchg(&afu->room, 0, room);
+			afu->read_room = false;
+		}
 	}
 
 	spin_unlock_irqrestore(cfg->host->host_lock, lock_flags);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 05/14] cxlflash: Remove unused buffer from AFU command
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (3 preceding siblings ...)
  2016-11-15 23:14 ` [PATCH 04/14] cxlflash: Avoid command room violation Uma Krishnan
@ 2016-11-15 23:14 ` Uma Krishnan
  2016-11-15 23:14 ` [PATCH 06/14] cxlflash: Allocate memory instead of using command pool for AFU sync Uma Krishnan
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:14 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

The cxlflash driver originally required a per-command 4K buffer that
hosted data passed to the AFU. When the routines that initiate AFU
and internal SCSI commands were refactored to use scsi_execute(), the
need for this buffer became obsolete. As it is no longer necessary,
the buffer is removed.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  1 -
 drivers/scsi/cxlflash/main.c   | 28 ++--------------------------
 2 files changed, 2 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 6e68155..0744546 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -134,7 +134,6 @@ struct afu_cmd {
 	struct sisl_ioasa sa;	/* IOASA must follow IOARCB */
 	spinlock_t slock;
 	struct completion cevent;
-	char *buf;		/* per command buffer */
 	struct afu *parent;
 	int slot;
 	atomic_t free;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 1a32e8b..51ee9ef 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -41,8 +41,7 @@ MODULE_LICENSE("GPL");
  * Commands are checked out in a round-robin fashion. Note that since
  * the command pool is larger than the hardware queue, the majority of
  * times we will only loop once or twice before getting a command. The
- * buffer and CDB within the command are initialized (zeroed) prior to
- * returning.
+ * CDB within the command is initialized (zeroed) prior to returning.
  *
  * Return: The checked out command or NULL when command pool is empty.
  */
@@ -59,7 +58,6 @@ static struct afu_cmd *cmd_checkout(struct afu *afu)
 		if (!atomic_dec_if_positive(&cmd->free)) {
 			pr_devel("%s: returning found index=%d cmd=%p\n",
 				 __func__, cmd->slot, cmd);
-			memset(cmd->buf, 0, CMD_BUFSIZE);
 			memset(cmd->rcb.cdb, 0, sizeof(cmd->rcb.cdb));
 			return cmd;
 		}
@@ -615,17 +613,9 @@ static void cxlflash_wait_for_pci_err_recovery(struct cxlflash_cfg *cfg)
  */
 static void free_mem(struct cxlflash_cfg *cfg)
 {
-	int i;
-	char *buf = NULL;
 	struct afu *afu = cfg->afu;
 
 	if (cfg->afu) {
-		for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
-			buf = afu->cmd[i].buf;
-			if (!((u64)buf & (PAGE_SIZE - 1)))
-				free_page((ulong)buf);
-		}
-
 		free_pages((ulong)afu, get_order(sizeof(struct afu)));
 		cfg->afu = NULL;
 	}
@@ -874,7 +864,6 @@ static int alloc_mem(struct cxlflash_cfg *cfg)
 {
 	int rc = 0;
 	int i;
-	char *buf = NULL;
 	struct device *dev = &cfg->dev->dev;
 
 	/* AFU is ~12k, i.e. only one 64k page or up to four 4k pages */
@@ -889,20 +878,7 @@ static int alloc_mem(struct cxlflash_cfg *cfg)
 	cfg->afu->parent = cfg;
 	cfg->afu->afu_map = NULL;
 
-	for (i = 0; i < CXLFLASH_NUM_CMDS; buf += CMD_BUFSIZE, i++) {
-		if (!((u64)buf & (PAGE_SIZE - 1))) {
-			buf = (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
-			if (unlikely(!buf)) {
-				dev_err(dev,
-					"%s: Allocate command buffers fail!\n",
-				       __func__);
-				rc = -ENOMEM;
-				free_mem(cfg);
-				goto out;
-			}
-		}
-
-		cfg->afu->cmd[i].buf = buf;
+	for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
 		atomic_set(&cfg->afu->cmd[i].free, 1);
 		cfg->afu->cmd[i].slot = i;
 	}
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 06/14] cxlflash: Allocate memory instead of using command pool for AFU sync
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (4 preceding siblings ...)
  2016-11-15 23:14 ` [PATCH 05/14] cxlflash: Remove unused buffer from AFU command Uma Krishnan
@ 2016-11-15 23:14 ` Uma Krishnan
  2016-11-15 23:15 ` [PATCH 07/14] cxlflash: Use cmd_size for private commands Uma Krishnan
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:14 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

As staging for the removal of the AFU command pool, remove the reliance
upon the pool for the internal AFU sync command. Instead of obtaining an
AFU command from the pool, dynamically allocate memory with the appropriate
alignment requirements. Since the AFU sync service is only executed from
the process environment, blocking is acceptable.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 51ee9ef..816d85c 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1847,8 +1847,8 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 	struct cxlflash_cfg *cfg = afu->parent;
 	struct device *dev = &cfg->dev->dev;
 	struct afu_cmd *cmd = NULL;
+	char *buf = NULL;
 	int rc = 0;
-	int retry_cnt = 0;
 	static DEFINE_MUTEX(sync_active);
 
 	if (cfg->state != STATE_NORMAL) {
@@ -1857,23 +1857,23 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 	}
 
 	mutex_lock(&sync_active);
-retry:
-	cmd = cmd_checkout(afu);
-	if (unlikely(!cmd)) {
-		retry_cnt++;
-		udelay(1000 * retry_cnt);
-		if (retry_cnt < MC_RETRY_CNT)
-			goto retry;
-		dev_err(dev, "%s: could not get a free command\n", __func__);
+	buf = kzalloc(sizeof(*cmd) + __alignof__(*cmd) - 1, GFP_KERNEL);
+	if (unlikely(!buf)) {
+		dev_err(dev, "%s: no memory for command\n", __func__);
 		rc = -1;
 		goto out;
 	}
 
-	pr_debug("%s: afu=%p cmd=%p %d\n", __func__, afu, cmd, ctx_hndl_u);
+	cmd = (struct afu_cmd *)PTR_ALIGN(buf, __alignof__(*cmd));
+	init_completion(&cmd->cevent);
+	spin_lock_init(&cmd->slock);
+	cmd->parent = afu;
 
-	memset(cmd->rcb.cdb, 0, sizeof(cmd->rcb.cdb));
+	pr_debug("%s: afu=%p cmd=%p %d\n", __func__, afu, cmd, ctx_hndl_u);
 
 	cmd->rcb.req_flags = SISL_REQ_FLAGS_AFU_CMD;
+	cmd->rcb.ctx_id = afu->ctx_hndl;
+	cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
 	cmd->rcb.port_sel = 0x0;	/* NA */
 	cmd->rcb.lun_id = 0x0;	/* NA */
 	cmd->rcb.data_len = 0x0;
@@ -1899,8 +1899,7 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 		rc = -1;
 out:
 	mutex_unlock(&sync_active);
-	if (cmd)
-		cmd_checkin(cmd);
+	kfree(buf);
 	pr_debug("%s: returning rc=%d\n", __func__, rc);
 	return rc;
 }
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 07/14] cxlflash: Use cmd_size for private commands
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (5 preceding siblings ...)
  2016-11-15 23:14 ` [PATCH 06/14] cxlflash: Allocate memory instead of using command pool for AFU sync Uma Krishnan
@ 2016-11-15 23:15 ` Uma Krishnan
  2016-11-15 23:15 ` [PATCH 08/14] cxlflash: Remove private command pool Uma Krishnan
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:15 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

Instead of using a private pool of AFU commands, use cmd_size to prime
the private pool of SCSI commands such that they are allocated with a
size large enough to contain an aligned AFU command. Use scsi_cmd_priv()
to derive the aligned/zeroed private command on queuecommand and TMF
paths. Remove cmd_checkout() as it is no longer required. The remaining
AFU private command infrastructure will be removed in a cleanup commit.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h | 14 +++++++++
 drivers/scsi/cxlflash/main.c   | 65 +++++++-----------------------------------
 2 files changed, 25 insertions(+), 54 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 0744546..40507a9 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -19,6 +19,7 @@
 #include <linux/rwsem.h>
 #include <linux/types.h>
 #include <scsi/scsi.h>
+#include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_device.h>
 
 extern const struct file_operations cxlflash_cxl_fops;
@@ -146,6 +147,19 @@ struct afu_cmd {
 	 */
 } __aligned(cache_line_size());
 
+static inline struct afu_cmd *sc_to_afuc(struct scsi_cmnd *sc)
+{
+	return PTR_ALIGN(scsi_cmd_priv(sc), __alignof__(struct afu_cmd));
+}
+
+static inline struct afu_cmd *sc_to_afucz(struct scsi_cmnd *sc)
+{
+	struct afu_cmd *afuc = sc_to_afuc(sc);
+
+	memset(afuc, 0, sizeof(*afuc));
+	return afuc;
+}
+
 struct afu {
 	/* Stuff requiring alignment go first. */
 
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 816d85c..7fa27db 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -35,38 +35,6 @@ MODULE_AUTHOR("Matthew R. Ochs <mrochs@linux.vnet.ibm.com>");
 MODULE_LICENSE("GPL");
 
 /**
- * cmd_checkout() - checks out an AFU command
- * @afu:	AFU to checkout from.
- *
- * Commands are checked out in a round-robin fashion. Note that since
- * the command pool is larger than the hardware queue, the majority of
- * times we will only loop once or twice before getting a command. The
- * CDB within the command is initialized (zeroed) prior to returning.
- *
- * Return: The checked out command or NULL when command pool is empty.
- */
-static struct afu_cmd *cmd_checkout(struct afu *afu)
-{
-	int k, dec = CXLFLASH_NUM_CMDS;
-	struct afu_cmd *cmd;
-
-	while (dec--) {
-		k = (afu->cmd_couts++ & (CXLFLASH_NUM_CMDS - 1));
-
-		cmd = &afu->cmd[k];
-
-		if (!atomic_dec_if_positive(&cmd->free)) {
-			pr_devel("%s: returning found index=%d cmd=%p\n",
-				 __func__, cmd->slot, cmd);
-			memset(cmd->rcb.cdb, 0, sizeof(cmd->rcb.cdb));
-			return cmd;
-		}
-	}
-
-	return NULL;
-}
-
-/**
  * cmd_checkin() - checks in an AFU command
  * @cmd:	AFU command to checkin.
  *
@@ -232,7 +200,6 @@ static void cmd_complete(struct afu_cmd *cmd)
 			scp->result = (DID_OK << 16);
 
 		cmd_is_tmf = cmd->cmd_tmf;
-		cmd_checkin(cmd); /* Don't use cmd after here */
 
 		pr_debug_ratelimited("%s: calling scsi_done scp=%p result=%X "
 				     "ioasc=%d\n", __func__, scp, scp->result,
@@ -390,7 +357,7 @@ static void wait_resp(struct afu *afu, struct afu_cmd *cmd)
  */
 static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 {
-	struct afu_cmd *cmd;
+	struct afu_cmd *cmd = sc_to_afucz(scp);
 
 	u32 port_sel = scp->device->channel + 1;
 	short lflag = 0;
@@ -401,13 +368,6 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	int rc = 0;
 	ulong to;
 
-	cmd = cmd_checkout(afu);
-	if (unlikely(!cmd)) {
-		dev_err(dev, "%s: could not get a free command\n", __func__);
-		rc = SCSI_MLQUEUE_HOST_BUSY;
-		goto out;
-	}
-
 	/* When Task Management Function is active do not send another */
 	spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 	if (cfg->tmf_active)
@@ -419,6 +379,7 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
 	cmd->rcb.ctx_id = afu->ctx_hndl;
+	cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
 	cmd->rcb.port_sel = port_sel;
 	cmd->rcb.lun_id = lun_to_lunid(scp->device->lun);
 
@@ -427,8 +388,10 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID |
 			      SISL_REQ_FLAGS_SUP_UNDERRUN | lflag);
 
-	/* Stash the scp in the reserved field, for reuse during interrupt */
+	/* Stash the scp in the command, for reuse during interrupt */
 	cmd->rcb.scp = scp;
+	cmd->parent = afu;
+	spin_lock_init(&cmd->slock);
 
 	/* Copy the CDB from the cmd passed in */
 	memcpy(cmd->rcb.cdb, &tmfcmd, sizeof(tmfcmd));
@@ -436,7 +399,6 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	/* Send the command */
 	rc = send_cmd(afu, cmd);
 	if (unlikely(rc)) {
-		cmd_checkin(cmd);
 		spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 		cfg->tmf_active = false;
 		spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
@@ -492,7 +454,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
 	struct afu *afu = cfg->afu;
 	struct device *dev = &cfg->dev->dev;
-	struct afu_cmd *cmd;
+	struct afu_cmd *cmd = sc_to_afucz(scp);
 	u32 port_sel = scp->device->channel + 1;
 	int nseg, i, ncount;
 	struct scatterlist *sg;
@@ -537,17 +499,11 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 		break;
 	}
 
-	cmd = cmd_checkout(afu);
-	if (unlikely(!cmd)) {
-		dev_err(dev, "%s: could not get a free command\n", __func__);
-		rc = SCSI_MLQUEUE_HOST_BUSY;
-		goto out;
-	}
-
 	kref_get(&cfg->afu->mapcount);
 	kref_got = 1;
 
 	cmd->rcb.ctx_id = afu->ctx_hndl;
+	cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
 	cmd->rcb.port_sel = port_sel;
 	cmd->rcb.lun_id = lun_to_lunid(scp->device->lun);
 
@@ -561,6 +517,8 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 
 	/* Stash the scp in the reserved field, for reuse during interrupt */
 	cmd->rcb.scp = scp;
+	cmd->parent = afu;
+	spin_lock_init(&cmd->slock);
 
 	nseg = scsi_dma_map(scp);
 	if (unlikely(nseg < 0)) {
@@ -581,10 +539,8 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 
 	/* Send the command */
 	rc = send_cmd(afu, cmd);
-	if (unlikely(rc)) {
-		cmd_checkin(cmd);
+	if (unlikely(rc))
 		scsi_dma_unmap(scp);
-	}
 
 out:
 	if (kref_got)
@@ -2338,6 +2294,7 @@ static struct scsi_host_template driver_template = {
 	.change_queue_depth = cxlflash_change_queue_depth,
 	.cmd_per_lun = CXLFLASH_MAX_CMDS_PER_LUN,
 	.can_queue = CXLFLASH_MAX_CMDS,
+	.cmd_size = sizeof(struct afu_cmd) + __alignof__(struct afu_cmd) - 1,
 	.this_id = -1,
 	.sg_tablesize = 1,	/* No scatter gather support */
 	.max_sectors = CXLFLASH_MAX_SECTORS,
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 08/14] cxlflash: Remove private command pool
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (6 preceding siblings ...)
  2016-11-15 23:15 ` [PATCH 07/14] cxlflash: Use cmd_size for private commands Uma Krishnan
@ 2016-11-15 23:15 ` Uma Krishnan
  2016-11-15 23:15 ` [PATCH 09/14] cxlflash: Wait for active AFU commands to timeout upon tear down Uma Krishnan
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:15 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

Clean up and remove the remaining private command pool infrastructure
that is no longer required.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  7 -----
 drivers/scsi/cxlflash/main.c   | 68 ------------------------------------------
 2 files changed, 75 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 40507a9..465330f 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -136,8 +136,6 @@ struct afu_cmd {
 	spinlock_t slock;
 	struct completion cevent;
 	struct afu *parent;
-	int slot;
-	atomic_t free;
 
 	u8 cmd_tmf:1;
 
@@ -164,10 +162,6 @@ struct afu {
 	/* Stuff requiring alignment go first. */
 
 	u64 rrq_entry[NUM_RRQ_ENTRY];	/* 2K RRQ */
-	/*
-	 * Command & data for AFU commands.
-	 */
-	struct afu_cmd cmd[CXLFLASH_NUM_CMDS];
 
 	/* Beware of alignment till here. Preferably introduce new
 	 * fields after this point
@@ -189,7 +183,6 @@ struct afu {
 	bool read_room;
 	atomic64_t room;
 	u64 hb;
-	u32 cmd_couts;		/* Number of command checkouts */
 	u32 internal_lun;	/* User-desired LUN mode for this AFU */
 
 	char version[16];
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 7fa27db..a17bff2 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -35,33 +35,6 @@ MODULE_AUTHOR("Matthew R. Ochs <mrochs@linux.vnet.ibm.com>");
 MODULE_LICENSE("GPL");
 
 /**
- * cmd_checkin() - checks in an AFU command
- * @cmd:	AFU command to checkin.
- *
- * Safe to pass commands that have already been checked in. Several
- * internal tracking fields are reset as part of the checkin. Note
- * that these are intentionally reset prior to toggling the free bit
- * to avoid clobbering values in the event that the command is checked
- * out right away.
- */
-static void cmd_checkin(struct afu_cmd *cmd)
-{
-	cmd->rcb.scp = NULL;
-	cmd->rcb.timeout = 0;
-	cmd->sa.ioasc = 0;
-	cmd->cmd_tmf = false;
-	cmd->sa.host_use[0] = 0; /* clears both completion and retry bytes */
-
-	if (unlikely(atomic_inc_return(&cmd->free) != 1)) {
-		pr_err("%s: Freeing cmd (%d) that is not in use!\n",
-		       __func__, cmd->slot);
-		return;
-	}
-
-	pr_devel("%s: released cmd %p index=%d\n", __func__, cmd, cmd->slot);
-}
-
-/**
  * process_cmd_err() - command error handler
  * @cmd:	AFU command that experienced the error.
  * @scp:	SCSI command associated with the AFU command in error.
@@ -585,28 +558,12 @@ static void free_mem(struct cxlflash_cfg *cfg)
  *
  * Cleans up all state associated with the command queue, and unmaps
  * the MMIO space.
- *
- *  - complete() will take care of commands we initiated (they'll be checked
- *  in as part of the cleanup that occurs after the completion)
- *
- *  - cmd_checkin() will take care of entries that we did not initiate and that
- *  have not (and will not) complete because they are sitting on a [now stale]
- *  hardware queue
  */
 static void stop_afu(struct cxlflash_cfg *cfg)
 {
-	int i;
 	struct afu *afu = cfg->afu;
-	struct afu_cmd *cmd;
 
 	if (likely(afu)) {
-		for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
-			cmd = &afu->cmd[i];
-			complete(&cmd->cevent);
-			if (!atomic_read(&cmd->free))
-				cmd_checkin(cmd);
-		}
-
 		if (likely(afu->afu_map)) {
 			cxl_psa_unmap((void __iomem *)afu->afu_map);
 			afu->afu_map = NULL;
@@ -819,7 +776,6 @@ static void cxlflash_remove(struct pci_dev *pdev)
 static int alloc_mem(struct cxlflash_cfg *cfg)
 {
 	int rc = 0;
-	int i;
 	struct device *dev = &cfg->dev->dev;
 
 	/* AFU is ~12k, i.e. only one 64k page or up to four 4k pages */
@@ -833,12 +789,6 @@ static int alloc_mem(struct cxlflash_cfg *cfg)
 	}
 	cfg->afu->parent = cfg;
 	cfg->afu->afu_map = NULL;
-
-	for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
-		atomic_set(&cfg->afu->cmd[i].free, 1);
-		cfg->afu->cmd[i].slot = i;
-	}
-
 out:
 	return rc;
 }
@@ -1468,13 +1418,6 @@ static void init_pcr(struct cxlflash_cfg *cfg)
 
 	/* Program the Endian Control for the master context */
 	writeq_be(SISL_ENDIAN_CTRL, &afu->host_map->endian_ctrl);
-
-	/* Initialize cmd fields that never change */
-	for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
-		afu->cmd[i].rcb.ctx_id = afu->ctx_hndl;
-		afu->cmd[i].rcb.msi = SISL_MSI_RRQ_UPDATED;
-		afu->cmd[i].rcb.rrq = 0x0;
-	}
 }
 
 /**
@@ -1563,19 +1506,8 @@ static int init_global(struct cxlflash_cfg *cfg)
 static int start_afu(struct cxlflash_cfg *cfg)
 {
 	struct afu *afu = cfg->afu;
-	struct afu_cmd *cmd;
-
-	int i = 0;
 	int rc = 0;
 
-	for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
-		cmd = &afu->cmd[i];
-
-		init_completion(&cmd->cevent);
-		spin_lock_init(&cmd->slock);
-		cmd->parent = afu;
-	}
-
 	init_pcr(cfg);
 
 	/* After an AFU reset, RRQ entries are stale, clear them */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 09/14] cxlflash: Wait for active AFU commands to timeout upon tear down
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (7 preceding siblings ...)
  2016-11-15 23:15 ` [PATCH 08/14] cxlflash: Remove private command pool Uma Krishnan
@ 2016-11-15 23:15 ` Uma Krishnan
  2016-11-15 23:15 ` [PATCH 10/14] cxlflash: Remove AFU command lock Uma Krishnan
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:15 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

With the removal of the static private command pool, the ability to
'complete' outstanding commands was lost. While not an issue for the
commands originating outside the driver, internal AFU commands are
synchronous and therefore have a timeout associated with them. To
avoid a stale memory access, the tear down sequence needs to ensure
that there are not any active commands before proceeding. As these
internal AFU commands are rare events, the simplest way to accomplish
this is detecting the activity and waiting for it to timeout.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h | 1 +
 drivers/scsi/cxlflash/main.c   | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 465330f..5e5651ba 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -181,6 +181,7 @@ struct afu {
 	u64 *hrrq_curr;
 	bool toggle;
 	bool read_room;
+	atomic_t cmds_active;	/* Number of currently active AFU commands */
 	atomic64_t room;
 	u64 hb;
 	u32 internal_lun;	/* User-desired LUN mode for this AFU */
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index a17bff2..f192b1c 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -556,7 +556,7 @@ static void free_mem(struct cxlflash_cfg *cfg)
  *
  * Safe to call with AFU in a partially allocated/initialized state.
  *
- * Cleans up all state associated with the command queue, and unmaps
+ * Waits for any active internal AFU commands to timeout and then unmaps
  * the MMIO space.
  */
 static void stop_afu(struct cxlflash_cfg *cfg)
@@ -564,6 +564,8 @@ static void stop_afu(struct cxlflash_cfg *cfg)
 	struct afu *afu = cfg->afu;
 
 	if (likely(afu)) {
+		while (atomic_read(&afu->cmds_active))
+			ssleep(1);
 		if (likely(afu->afu_map)) {
 			cxl_psa_unmap((void __iomem *)afu->afu_map);
 			afu->afu_map = NULL;
@@ -1745,6 +1747,7 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 	}
 
 	mutex_lock(&sync_active);
+	atomic_inc(&afu->cmds_active);
 	buf = kzalloc(sizeof(*cmd) + __alignof__(*cmd) - 1, GFP_KERNEL);
 	if (unlikely(!buf)) {
 		dev_err(dev, "%s: no memory for command\n", __func__);
@@ -1786,6 +1789,7 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 		     (cmd->sa.host_use_b[0] & B_ERROR)))
 		rc = -1;
 out:
+	atomic_dec(&afu->cmds_active);
 	mutex_unlock(&sync_active);
 	kfree(buf);
 	pr_debug("%s: returning rc=%d\n", __func__, rc);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 10/14] cxlflash: Remove AFU command lock
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (8 preceding siblings ...)
  2016-11-15 23:15 ` [PATCH 09/14] cxlflash: Wait for active AFU commands to timeout upon tear down Uma Krishnan
@ 2016-11-15 23:15 ` Uma Krishnan
  2016-11-15 23:15 ` [PATCH 11/14] cxlflash: Cleanup send_tmf() Uma Krishnan
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:15 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

The original design of the cxlflash driver required AFU commands
to convey state information across multiple threads. The IOASA
"host use" byte was used to track if a command was done, errored,
or timed out. A per-command spin lock was used to serialize access
to this byte. As this is no longer required with the introduction
of completions and various refactoring over time, the spin lock,
state tracking, and associated code can be removed. To support the
simplification, the wait_resp() routine is refactored to return a
success or failure. Additionally, as the simplification to the
AFU internal command routine, explicit assignments of AFU command
fields to zero are removed as the memory is zeroed upon allocation.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  8 +-------
 drivers/scsi/cxlflash/main.c   | 46 ++++++++++++++----------------------------
 2 files changed, 16 insertions(+), 38 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 5e5651ba..c4a7014 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -63,11 +63,6 @@ static inline void check_sizes(void)
 /* AFU defines a fixed size of 4K for command buffers (borrow 4K page define) */
 #define CMD_BUFSIZE     SIZE_4K
 
-/* flags in IOA status area for host use */
-#define B_DONE       0x01
-#define B_ERROR      0x02	/* set with B_DONE */
-#define B_TIMEOUT    0x04	/* set with B_DONE & B_ERROR */
-
 enum cxlflash_lr_state {
 	LINK_RESET_INVALID,
 	LINK_RESET_REQUIRED,
@@ -133,9 +128,8 @@ struct cxlflash_cfg {
 struct afu_cmd {
 	struct sisl_ioarcb rcb;	/* IOARCB (cache line aligned) */
 	struct sisl_ioasa sa;	/* IOASA must follow IOARCB */
-	spinlock_t slock;
-	struct completion cevent;
 	struct afu *parent;
+	struct completion cevent;
 
 	u8 cmd_tmf:1;
 
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index f192b1c..e5b5d8f 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -161,10 +161,6 @@ static void cmd_complete(struct afu_cmd *cmd)
 	struct cxlflash_cfg *cfg = afu->parent;
 	bool cmd_is_tmf;
 
-	spin_lock_irqsave(&cmd->slock, lock_flags);
-	cmd->sa.host_use_b[0] |= B_DONE;
-	spin_unlock_irqrestore(&cmd->slock, lock_flags);
-
 	if (cmd->rcb.scp) {
 		scp = cmd->rcb.scp;
 		if (unlikely(cmd->sa.ioasc))
@@ -204,21 +200,9 @@ static void context_reset(struct afu_cmd *cmd)
 	struct afu *afu = cmd->parent;
 	struct cxlflash_cfg *cfg = afu->parent;
 	struct device *dev = &cfg->dev->dev;
-	ulong lock_flags;
 
 	pr_debug("%s: cmd=%p\n", __func__, cmd);
 
-	spin_lock_irqsave(&cmd->slock, lock_flags);
-
-	/* Already completed? */
-	if (cmd->sa.host_use_b[0] & B_DONE) {
-		spin_unlock_irqrestore(&cmd->slock, lock_flags);
-		return;
-	}
-
-	cmd->sa.host_use_b[0] |= (B_DONE | B_ERROR | B_TIMEOUT);
-	spin_unlock_irqrestore(&cmd->slock, lock_flags);
-
 	writeq_be(rrin, &afu->host_map->ioarrin);
 	do {
 		rrin = readq_be(&afu->host_map->ioarrin);
@@ -303,20 +287,30 @@ static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
  * wait_resp() - polls for a response or timeout to a sent AFU command
  * @afu:	AFU associated with the host.
  * @cmd:	AFU command that was sent.
+ *
+ * Return:
+ *	0 on success, -1 on timeout/error
  */
-static void wait_resp(struct afu *afu, struct afu_cmd *cmd)
+static int wait_resp(struct afu *afu, struct afu_cmd *cmd)
 {
+	int rc = 0;
 	ulong timeout = msecs_to_jiffies(cmd->rcb.timeout * 2 * 1000);
 
 	timeout = wait_for_completion_timeout(&cmd->cevent, timeout);
-	if (!timeout)
+	if (!timeout) {
 		context_reset(cmd);
+		rc = -1;
+	}
 
-	if (unlikely(cmd->sa.ioasc != 0))
+	if (unlikely(cmd->sa.ioasc != 0)) {
 		pr_err("%s: CMD 0x%X failed, IOASC: flags 0x%X, afu_rc 0x%X, "
 		       "scsi_rc 0x%X, fc_rc 0x%X\n", __func__, cmd->rcb.cdb[0],
 		       cmd->sa.rc.flags, cmd->sa.rc.afu_rc, cmd->sa.rc.scsi_rc,
 		       cmd->sa.rc.fc_rc);
+		rc = -1;
+	}
+
+	return rc;
 }
 
 /**
@@ -364,7 +358,6 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	/* Stash the scp in the command, for reuse during interrupt */
 	cmd->rcb.scp = scp;
 	cmd->parent = afu;
-	spin_lock_init(&cmd->slock);
 
 	/* Copy the CDB from the cmd passed in */
 	memcpy(cmd->rcb.cdb, &tmfcmd, sizeof(tmfcmd));
@@ -491,7 +484,6 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	/* Stash the scp in the reserved field, for reuse during interrupt */
 	cmd->rcb.scp = scp;
 	cmd->parent = afu;
-	spin_lock_init(&cmd->slock);
 
 	nseg = scsi_dma_map(scp);
 	if (unlikely(nseg < 0)) {
@@ -1757,7 +1749,6 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 
 	cmd = (struct afu_cmd *)PTR_ALIGN(buf, __alignof__(*cmd));
 	init_completion(&cmd->cevent);
-	spin_lock_init(&cmd->slock);
 	cmd->parent = afu;
 
 	pr_debug("%s: afu=%p cmd=%p %d\n", __func__, afu, cmd, ctx_hndl_u);
@@ -1765,10 +1756,6 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 	cmd->rcb.req_flags = SISL_REQ_FLAGS_AFU_CMD;
 	cmd->rcb.ctx_id = afu->ctx_hndl;
 	cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
-	cmd->rcb.port_sel = 0x0;	/* NA */
-	cmd->rcb.lun_id = 0x0;	/* NA */
-	cmd->rcb.data_len = 0x0;
-	cmd->rcb.data_ea = 0x0;
 	cmd->rcb.timeout = MC_AFU_SYNC_TIMEOUT;
 
 	cmd->rcb.cdb[0] = 0xC0;	/* AFU Sync */
@@ -1782,11 +1769,8 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 	if (unlikely(rc))
 		goto out;
 
-	wait_resp(afu, cmd);
-
-	/* Set on timeout */
-	if (unlikely((cmd->sa.ioasc != 0) ||
-		     (cmd->sa.host_use_b[0] & B_ERROR)))
+	rc = wait_resp(afu, cmd);
+	if (unlikely(rc))
 		rc = -1;
 out:
 	atomic_dec(&afu->cmds_active);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 11/14] cxlflash: Cleanup send_tmf()
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (9 preceding siblings ...)
  2016-11-15 23:15 ` [PATCH 10/14] cxlflash: Remove AFU command lock Uma Krishnan
@ 2016-11-15 23:15 ` Uma Krishnan
  2016-11-15 23:15 ` [PATCH 12/14] cxlflash: Cleanup queuecommand() Uma Krishnan
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:15 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

The send_tmf() routine includes some copy/paste cruft that can be
removed as well as the setting of an AFU command-specific while
holding the tmf_slock. While not a bug, it is out of place and
should be shifted down alongside the other command initialization
statements for clarity.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index e5b5d8f..5acc3fe 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -324,12 +324,10 @@ static int wait_resp(struct afu *afu, struct afu_cmd *cmd)
  */
 static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 {
-	struct afu_cmd *cmd = sc_to_afucz(scp);
-
 	u32 port_sel = scp->device->channel + 1;
-	short lflag = 0;
 	struct Scsi_Host *host = scp->device->host;
 	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)host->hostdata;
+	struct afu_cmd *cmd = sc_to_afucz(scp);
 	struct device *dev = &cfg->dev->dev;
 	ulong lock_flags;
 	int rc = 0;
@@ -342,27 +340,21 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 						  !cfg->tmf_active,
 						  cfg->tmf_slock);
 	cfg->tmf_active = true;
-	cmd->cmd_tmf = true;
 	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
+	cmd->rcb.scp = scp;
+	cmd->parent = afu;
+	cmd->cmd_tmf = true;
+
 	cmd->rcb.ctx_id = afu->ctx_hndl;
 	cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
 	cmd->rcb.port_sel = port_sel;
 	cmd->rcb.lun_id = lun_to_lunid(scp->device->lun);
-
-	lflag = SISL_REQ_FLAGS_TMF_CMD;
-
 	cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID |
-			      SISL_REQ_FLAGS_SUP_UNDERRUN | lflag);
-
-	/* Stash the scp in the command, for reuse during interrupt */
-	cmd->rcb.scp = scp;
-	cmd->parent = afu;
-
-	/* Copy the CDB from the cmd passed in */
+			      SISL_REQ_FLAGS_SUP_UNDERRUN |
+			      SISL_REQ_FLAGS_TMF_CMD);
 	memcpy(cmd->rcb.cdb, &tmfcmd, sizeof(tmfcmd));
 
-	/* Send the command */
 	rc = send_cmd(afu, cmd);
 	if (unlikely(rc)) {
 		spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 12/14] cxlflash: Cleanup queuecommand()
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (10 preceding siblings ...)
  2016-11-15 23:15 ` [PATCH 11/14] cxlflash: Cleanup send_tmf() Uma Krishnan
@ 2016-11-15 23:15 ` Uma Krishnan
  2016-11-15 23:16 ` [PATCH 13/14] cxlflash: Migrate IOARRIN specific routines to function pointers Uma Krishnan
  2016-11-15 23:16 ` [PATCH 14/14] cxlflash: Migrate scsi command pointer to AFU command Uma Krishnan
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:15 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

The queuecommand routine is disorganized where it populates the
private command and also contains some logic/statements that are
not needed given that cxlflash devices do not (and likely never
will) support scatter-gather.

Restructure the code to remove the unnecessary logic and create an
organized flow:

	handle state -> DMA map -> populate command -> send command

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 50 ++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 5acc3fe..d5b355c 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -413,11 +413,11 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	struct afu *afu = cfg->afu;
 	struct device *dev = &cfg->dev->dev;
 	struct afu_cmd *cmd = sc_to_afucz(scp);
+	struct scatterlist *sg = scsi_sglist(scp);
 	u32 port_sel = scp->device->channel + 1;
-	int nseg, i, ncount;
-	struct scatterlist *sg;
+	u16 req_flags = SISL_REQ_FLAGS_SUP_UNDERRUN;
 	ulong lock_flags;
-	short lflag = 0;
+	int nseg = 0;
 	int rc = 0;
 	int kref_got = 0;
 
@@ -460,45 +460,35 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	kref_get(&cfg->afu->mapcount);
 	kref_got = 1;
 
-	cmd->rcb.ctx_id = afu->ctx_hndl;
-	cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
-	cmd->rcb.port_sel = port_sel;
-	cmd->rcb.lun_id = lun_to_lunid(scp->device->lun);
-
-	if (scp->sc_data_direction == DMA_TO_DEVICE)
-		lflag = SISL_REQ_FLAGS_HOST_WRITE;
-	else
-		lflag = SISL_REQ_FLAGS_HOST_READ;
+	if (likely(sg)) {
+		nseg = scsi_dma_map(scp);
+		if (unlikely(nseg < 0)) {
+			dev_err(dev, "%s: Fail DMA map!\n", __func__);
+			rc = SCSI_MLQUEUE_HOST_BUSY;
+			goto out;
+		}
 
-	cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID |
-			      SISL_REQ_FLAGS_SUP_UNDERRUN | lflag);
+		cmd->rcb.data_len = sg_dma_len(sg);
+		cmd->rcb.data_ea = sg_dma_address(sg);
+	}
 
-	/* Stash the scp in the reserved field, for reuse during interrupt */
 	cmd->rcb.scp = scp;
 	cmd->parent = afu;
 
-	nseg = scsi_dma_map(scp);
-	if (unlikely(nseg < 0)) {
-		dev_err(dev, "%s: Fail DMA map! nseg=%d\n",
-			__func__, nseg);
-		rc = SCSI_MLQUEUE_HOST_BUSY;
-		goto out;
-	}
+	cmd->rcb.ctx_id = afu->ctx_hndl;
+	cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
+	cmd->rcb.port_sel = port_sel;
+	cmd->rcb.lun_id = lun_to_lunid(scp->device->lun);
 
-	ncount = scsi_sg_count(scp);
-	scsi_for_each_sg(scp, sg, ncount, i) {
-		cmd->rcb.data_len = sg_dma_len(sg);
-		cmd->rcb.data_ea = sg_dma_address(sg);
-	}
+	if (scp->sc_data_direction == DMA_TO_DEVICE)
+		req_flags |= SISL_REQ_FLAGS_HOST_WRITE;
 
-	/* Copy the CDB from the scsi_cmnd passed in */
+	cmd->rcb.req_flags = req_flags;
 	memcpy(cmd->rcb.cdb, scp->cmnd, sizeof(cmd->rcb.cdb));
 
-	/* Send the command */
 	rc = send_cmd(afu, cmd);
 	if (unlikely(rc))
 		scsi_dma_unmap(scp);
-
 out:
 	if (kref_got)
 		kref_put(&afu->mapcount, afu_unmap);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 13/14] cxlflash: Migrate IOARRIN specific routines to function pointers
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (11 preceding siblings ...)
  2016-11-15 23:15 ` [PATCH 12/14] cxlflash: Cleanup queuecommand() Uma Krishnan
@ 2016-11-15 23:16 ` Uma Krishnan
  2016-11-15 23:16 ` [PATCH 14/14] cxlflash: Migrate scsi command pointer to AFU command Uma Krishnan
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:16 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

As staging for supporting hardware with a different queuing mechanism,
move the send_cmd() and context_reset() routines to function pointers
that are configured when the AFU is initialized. In addition, rename
the existing routines to better reflect the queue model they support.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  3 +++
 drivers/scsi/cxlflash/main.c   | 21 +++++++++++----------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index c4a7014..30c3cdb 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -161,6 +161,9 @@ struct afu {
 	 * fields after this point
 	 */
 
+	int (*send_cmd)(struct afu *, struct afu_cmd *);
+	void (*context_reset)(struct afu_cmd *);
+
 	/* AFU HW */
 	struct cxl_ioctl_start_work work;
 	struct cxlflash_afu_map __iomem *afu_map;	/* entire MMIO map */
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index d5b355c..7fc8843 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -188,12 +188,10 @@ static void cmd_complete(struct afu_cmd *cmd)
 }
 
 /**
- * context_reset() - timeout handler for AFU commands
+ * context_reset_ioarrin() - reset command owner context via IOARRIN register
  * @cmd:	AFU command that timed out.
- *
- * Sends a reset to the AFU.
  */
-static void context_reset(struct afu_cmd *cmd)
+static void context_reset_ioarrin(struct afu_cmd *cmd)
 {
 	int nretry = 0;
 	u64 rrin = 0x1;
@@ -217,14 +215,14 @@ static void context_reset(struct afu_cmd *cmd)
 }
 
 /**
- * send_cmd() - sends an AFU command
+ * send_cmd_ioarrin() - sends an AFU command via IOARRIN register
  * @afu:	AFU associated with the host.
  * @cmd:	AFU command to send.
  *
  * Return:
  *	0 on success, SCSI_MLQUEUE_HOST_BUSY on failure
  */
-static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
+static int send_cmd_ioarrin(struct afu *afu, struct afu_cmd *cmd)
 {
 	struct cxlflash_cfg *cfg = afu->parent;
 	struct device *dev = &cfg->dev->dev;
@@ -298,7 +296,7 @@ static int wait_resp(struct afu *afu, struct afu_cmd *cmd)
 
 	timeout = wait_for_completion_timeout(&cmd->cevent, timeout);
 	if (!timeout) {
-		context_reset(cmd);
+		afu->context_reset(cmd);
 		rc = -1;
 	}
 
@@ -355,7 +353,7 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 			      SISL_REQ_FLAGS_TMF_CMD);
 	memcpy(cmd->rcb.cdb, &tmfcmd, sizeof(tmfcmd));
 
-	rc = send_cmd(afu, cmd);
+	rc = afu->send_cmd(afu, cmd);
 	if (unlikely(rc)) {
 		spin_lock_irqsave(&cfg->tmf_slock, lock_flags);
 		cfg->tmf_active = false;
@@ -486,7 +484,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 	cmd->rcb.req_flags = req_flags;
 	memcpy(cmd->rcb.cdb, scp->cmnd, sizeof(cmd->rcb.cdb));
 
-	rc = send_cmd(afu, cmd);
+	rc = afu->send_cmd(afu, cmd);
 	if (unlikely(rc))
 		scsi_dma_unmap(scp);
 out:
@@ -1656,6 +1654,9 @@ static int init_afu(struct cxlflash_cfg *cfg)
 		goto err2;
 	}
 
+	afu->send_cmd = send_cmd_ioarrin;
+	afu->context_reset = context_reset_ioarrin;
+
 	pr_debug("%s: afu version %s, interface version 0x%llX\n", __func__,
 		 afu->version, afu->interface_version);
 
@@ -1747,7 +1748,7 @@ int cxlflash_afu_sync(struct afu *afu, ctx_hndl_t ctx_hndl_u,
 	*((__be16 *)&cmd->rcb.cdb[2]) = cpu_to_be16(ctx_hndl_u);
 	*((__be32 *)&cmd->rcb.cdb[4]) = cpu_to_be32(res_hndl_u);
 
-	rc = send_cmd(afu, cmd);
+	rc = afu->send_cmd(afu, cmd);
 	if (unlikely(rc))
 		goto out;
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 14/14] cxlflash: Migrate scsi command pointer to AFU command
  2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
                   ` (12 preceding siblings ...)
  2016-11-15 23:16 ` [PATCH 13/14] cxlflash: Migrate IOARRIN specific routines to function pointers Uma Krishnan
@ 2016-11-15 23:16 ` Uma Krishnan
  13 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-15 23:16 UTC (permalink / raw)
  To: linux-scsi, James Bottomley, Martin K. Petersen, Matthew R. Ochs,
	Manoj N. Kumar
  Cc: Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard, Uma Krishnan

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>

Currently, when sending a SCSI command, the pointer is stored in a
reserved field of the AFU command descriptor for retrieval once the
SCSI command has completed. In order to support new descriptor formats
that make use of the reserved field, the pointer is migrated to outside
the descriptor where it can still be found during completion processing.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h  |  1 +
 drivers/scsi/cxlflash/main.c    | 10 +++++-----
 drivers/scsi/cxlflash/sislite.h |  2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index 30c3cdb..7a892c5 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -129,6 +129,7 @@ struct afu_cmd {
 	struct sisl_ioarcb rcb;	/* IOARCB (cache line aligned) */
 	struct sisl_ioasa sa;	/* IOASA must follow IOARCB */
 	struct afu *parent;
+	struct scsi_cmnd *scp;
 	struct completion cevent;
 
 	u8 cmd_tmf:1;
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 7fc8843..08ef68a 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -151,7 +151,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
  *
  * Prepares and submits command that has either completed or timed out to
  * the SCSI stack. Checks AFU command back into command pool for non-internal
- * (rcb.scp populated) commands.
+ * (cmd->scp populated) commands.
  */
 static void cmd_complete(struct afu_cmd *cmd)
 {
@@ -161,8 +161,8 @@ static void cmd_complete(struct afu_cmd *cmd)
 	struct cxlflash_cfg *cfg = afu->parent;
 	bool cmd_is_tmf;
 
-	if (cmd->rcb.scp) {
-		scp = cmd->rcb.scp;
+	if (cmd->scp) {
+		scp = cmd->scp;
 		if (unlikely(cmd->sa.ioasc))
 			process_cmd_err(cmd, scp);
 		else
@@ -340,7 +340,7 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
 	cfg->tmf_active = true;
 	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
 
-	cmd->rcb.scp = scp;
+	cmd->scp = scp;
 	cmd->parent = afu;
 	cmd->cmd_tmf = true;
 
@@ -470,7 +470,7 @@ static int cxlflash_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scp)
 		cmd->rcb.data_ea = sg_dma_address(sg);
 	}
 
-	cmd->rcb.scp = scp;
+	cmd->scp = scp;
 	cmd->parent = afu;
 
 	cmd->rcb.ctx_id = afu->ctx_hndl;
diff --git a/drivers/scsi/cxlflash/sislite.h b/drivers/scsi/cxlflash/sislite.h
index 347fc16..1a2d09c 100644
--- a/drivers/scsi/cxlflash/sislite.h
+++ b/drivers/scsi/cxlflash/sislite.h
@@ -72,7 +72,7 @@ struct sisl_ioarcb {
 	u16 timeout;		/* in units specified by req_flags */
 	u32 rsvd1;
 	u8 cdb[16];		/* must be in big endian */
-	struct scsi_cmnd *scp;
+	u64 reserved;		/* Reserved area */
 } __packed;
 
 struct sisl_rc {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE
  2016-11-15 23:13 ` [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE Uma Krishnan
@ 2016-11-17 19:20   ` Matthew R. Ochs
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew R. Ochs @ 2016-11-17 19:20 UTC (permalink / raw)
  To: Uma Krishnan
  Cc: linux-scsi, James Bottomley, Martin K. Petersen, Manoj N. Kumar,
	Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard

> On Nov 15, 2016, at 5:13 PM, Uma Krishnan <ukrishn@linux.vnet.ibm.com> =
wrote:
>=20
> The following Oops is encountered when blk_mq is enabled with the
> cxlflash driver:
>=20
> [ 2960.817172] Oops: Kernel access of bad area, sig: 11 [#5]
> [ 2960.817309] NIP  __blk_mq_run_hw_queue+0x278/0x4c0
> [ 2960.817313] LR __blk_mq_run_hw_queue+0x2bc/0x4c0
> [ 2960.817314] Call Trace:
> [ 2960.817320] __blk_mq_run_hw_queue+0x2bc/0x4c0 (unreliable)
> [ 2960.817324] blk_mq_run_hw_queue+0xd8/0x100
> [ 2960.817329] blk_mq_insert_requests+0x14c/0x1f0
> [ 2960.817333] blk_mq_flush_plug_list+0x150/0x190
> [ 2960.817338] blk_flush_plug_list+0x11c/0x2b0
> [ 2960.817344] blk_finish_plug+0x58/0x80
> [ 2960.817348] __do_page_cache_readahead+0x1c0/0x2e0
> [ 2960.817352] force_page_cache_readahead+0x68/0xd0
> [ 2960.817356] generic_file_read_iter+0x43c/0x6a0
> [ 2960.817359] blkdev_read_iter+0x68/0xa0
> [ 2960.817361] __vfs_read+0x11c/0x180
> [ 2960.817364] vfs_read+0xa4/0x1c0
> [ 2960.817366] SyS_read+0x6c/0x110
> [ 2960.817369] system_call+0x38/0xb4
>=20
> The SCSI blk_mq stack assumes that sg_tablesize is always a non-zero
> value with scsi_mq_setup_tags() allocating tags using sg_tablesize.
> The cxlflash driver currently uses SG_NONE (0) for the sg_tablesize
> as the devices it supports are not capable of scatter gather. This
> mismatch of values results in the Oops above.
>=20
> To resolve this issue, sg_tablesize for cxlflash can simply be set
> to 1, a value which satisfies the constraints in cxlflash and the
> lack of support of SG_NONE in SCSI blk_mq.
>=20
> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable()
  2016-11-15 23:14 ` [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable() Uma Krishnan
@ 2016-11-17 19:20   ` Matthew R. Ochs
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew R. Ochs @ 2016-11-17 19:20 UTC (permalink / raw)
  To: Uma Krishnan
  Cc: linux-scsi, James Bottomley, Martin K. Petersen, Manoj N. Kumar,
	Christophe Lombard, Frederic Barrat, Ian Munsie,
	Andrew Donnellan, Brian King, linuxppc-dev

> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukrishn@linux.vnet.ibm.com> =
wrote:
>=20
> During test, the following crash was observed:
>=20
> [34538.981505] Faulting instruction address: 0xd000000007c9c870
> cpu 0x9: Vector: 300 (Data Access) at [c0000007f1e8f590]
>    pc: d000000007c9c870: cxlflash_restore_luntable+0x70/0x1d0 =
[cxlflash]
>    lr: d000000007c9c84c: cxlflash_restore_luntable+0x4c/0x1d0 =
[cxlflash]
>    sp: c0000007f1e8f810
>   msr: 9000000100009033
>   dar: c00000171d637438
> dsisr: 40000000
>  current =3D 0xc0000007f1e43f90
>  paca    =3D 0xc000000007b25100   softe: 0        irq_happened: 0x01
>    pid   =3D 493, comm =3D eehd
> enter ? for help
> [c0000007f1e8f8a0] d000000007c940b0 init_afu+0xd60/0x1200 [cxlflash]
> [c0000007f1e8f9a0] d000000007c945a8 cxlflash_pci_slot_reset+0x58/0xe0 =
[cxlflash]
> [c0000007f1e8fa20] d00000000715f790 cxl_pci_slot_reset+0x230/0x340 =
[cxl]
> [c0000007f1e8fae0] c000000000040dd4 eeh_report_reset+0x144/0x180
> [c0000007f1e8fb20] c00000000003f708 eeh_pe_dev_traverse+0x98/0x170
> [c0000007f1e8fbb0] c000000000041618 =
eeh_handle_normal_event+0x328/0x410
> [c0000007f1e8fc30] c000000000041db8 eeh_handle_event+0x178/0x330
> [c0000007f1e8fce0] c000000000042118 eeh_event_handler+0x1a8/0x1b0
> [c0000007f1e8fd80] c00000000011420c kthread+0xec/0x100
> [c0000007f1e8fe30] c00000000000a47c ret_from_kernel_thread+0x5c/0xe0
>=20
> When superpipe mode is disabled for a LUN, the references for the
> local lun are deleted but the LUN is still identified as being present
> in the LUN table. This mismatched state can result in the above crash
> when the LUN table is restored during an error recovery operation.
>=20
> To fix this issue, the local LUN information structure is updated to
> reflect the LUN is no longer in the LUN table once all references to
> the LUN are gone.
>=20
> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 03/14] cxlflash: Improve context_reset() logic
  2016-11-15 23:14 ` [PATCH 03/14] cxlflash: Improve context_reset() logic Uma Krishnan
@ 2016-11-17 19:21   ` Matthew R. Ochs
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew R. Ochs @ 2016-11-17 19:21 UTC (permalink / raw)
  To: Uma Krishnan
  Cc: linux-scsi, James Bottomley, Martin K. Petersen, Manoj N. Kumar,
	Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard

> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukrishn@linux.vnet.ibm.com> =
wrote:
>=20
> Currently, the context reset routine waits for command room to
> be available before sending the reset request. Per review of the
> SISLite specification and clarifications from the CXL Flash AFU
> designers, this wait is unnecessary. The reset request can be
> sent anytime regardless of command room, so long as only a single
> reset request is active at any one point in time.
>=20
> This commit simplifies the reset routine by removing the wait for
> command room. Additionally it adds a debug trace to help pinpoint
> hardware errors when a context reset does not complete.
>=20
> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/14] cxlflash: Avoid command room violation
  2016-11-15 23:14 ` [PATCH 04/14] cxlflash: Avoid command room violation Uma Krishnan
@ 2016-11-17 19:36   ` Matthew R. Ochs
  2016-11-17 22:30     ` Uma Krishnan
  0 siblings, 1 reply; 20+ messages in thread
From: Matthew R. Ochs @ 2016-11-17 19:36 UTC (permalink / raw)
  To: Uma Krishnan
  Cc: linux-scsi, James Bottomley, Martin K. Petersen, Manoj N. Kumar,
	Brian King, linuxppc-dev, Ian Munsie, Andrew Donnellan,
	Frederic Barrat, Christophe Lombard

Hi Uma,

I do see a potential hang issue with this patch. See my comments below.


-matt
=20
> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukrishn@linux.vnet.ibm.com> =
wrote:
>=20
> During test, a command room violation interrupt is occasionally seen
> for the master context when the CXL flash devices are stressed.
>=20
> After studying the code, there could be gaps in the way command room
> value is being cached in cxlflash. When the cached command room is =
zero
> the thread attempting to send becomes burdened with updating the =
cached
> value with the actual value from the AFU. Today, this is handled with
> an atomic set operation of the raw value read. Following the atomic
> update, the thread proceeds to send.
>=20
> This behavior is incorrect on two counts:
>=20
>   - The update fails to take into account the current thread and its
>     consumption of one of the hardware commands.
>=20
>   - The update does not take into account other threads also =
atomically
>     updating. Per design, a worker thread updates the cached value =
when
>     a send thread times out. By not performing an atomic =
compare/exchange,
>     the cached value can be incorrectly clobbered.
>=20
> To correct these issues, the runtime updates of the cached command =
room
> are updated to use atomic64_cmpxchg() and the send routine is updated =
to
> take into account the current thread consuming a hardware command.
>=20
> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
> ---
> drivers/scsi/cxlflash/main.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>=20
> diff --git a/drivers/scsi/cxlflash/main.c =
b/drivers/scsi/cxlflash/main.c
> index 6d33d8c..1a32e8b 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -322,9 +322,10 @@ static int send_cmd(struct afu *afu, struct =
afu_cmd *cmd)
> 	if (!newval) {

When this path is invoked, the current thread is consuming the last =
entry
available entry before the room must be read again. While the change
below is fine for circumstances where the hardware queue has room for
more than one command, consider a scenario where the queue has room
for only 1 command (the command that you just consumed via the atomic
but are not really consuming with a MMIO due to the revised goto).

In such a scenario this code would loop endlessly, bypassing the timeout
logic completely, until the read room reflected a value greater than 1.

> 		do {
> 			room =3D readq_be(&afu->host_map->cmd_room);
> -			atomic64_set(&afu->room, room);
> -			if (room)
> -				goto write_ioarrin;
> +			if (room) {
> +				atomic64_cmpxchg(&afu->room, 0, room);
> +				goto retry;
> +			}

If you instead fully consume the entry (goto write_ioarrin - similar as =
it was
before) and take into account the consumption when you update the cached
value (i.e.: cmpxchg(..., 0, room - 1) the scenario described above will =
not occur.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 04/14] cxlflash: Avoid command room violation
  2016-11-17 19:36   ` Matthew R. Ochs
@ 2016-11-17 22:30     ` Uma Krishnan
  0 siblings, 0 replies; 20+ messages in thread
From: Uma Krishnan @ 2016-11-17 22:30 UTC (permalink / raw)
  To: Matthew R. Ochs
  Cc: James Bottomley, linux-scsi, Martin K. Petersen, Frederic Barrat,
	Manoj N. Kumar, Ian Munsie, Andrew Donnellan, Brian King,
	linuxppc-dev, Christophe Lombard

Thanks for catching this Matt. Looking into this. Will send out a V2.

On 11/17/2016 1:36 PM, Matthew R. Ochs wrote:
> Hi Uma,
>
> I do see a potential hang issue with this patch. See my comments below.
>
>
> -matt
>
>> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukrishn@linux.vnet.ibm.com> wrote:
>>
>> During test, a command room violation interrupt is occasionally seen
>> for the master context when the CXL flash devices are stressed.
>>
>> After studying the code, there could be gaps in the way command room
>> value is being cached in cxlflash. When the cached command room is zero
>> the thread attempting to send becomes burdened with updating the cached
>> value with the actual value from the AFU. Today, this is handled with
>> an atomic set operation of the raw value read. Following the atomic
>> update, the thread proceeds to send.
>>
>> This behavior is incorrect on two counts:
>>
>>   - The update fails to take into account the current thread and its
>>     consumption of one of the hardware commands.
>>
>>   - The update does not take into account other threads also atomically
>>     updating. Per design, a worker thread updates the cached value when
>>     a send thread times out. By not performing an atomic compare/exchange,
>>     the cached value can be incorrectly clobbered.
>>
>> To correct these issues, the runtime updates of the cached command room
>> are updated to use atomic64_cmpxchg() and the send routine is updated to
>> take into account the current thread consuming a hardware command.
>>
>> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
>> ---
>> drivers/scsi/cxlflash/main.c | 16 ++++++++++------
>> 1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
>> index 6d33d8c..1a32e8b 100644
>> --- a/drivers/scsi/cxlflash/main.c
>> +++ b/drivers/scsi/cxlflash/main.c
>> @@ -322,9 +322,10 @@ static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
>> 	if (!newval) {
>
> When this path is invoked, the current thread is consuming the last entry
> available entry before the room must be read again. While the change
> below is fine for circumstances where the hardware queue has room for
> more than one command, consider a scenario where the queue has room
> for only 1 command (the command that you just consumed via the atomic
> but are not really consuming with a MMIO due to the revised goto).
>
> In such a scenario this code would loop endlessly, bypassing the timeout
> logic completely, until the read room reflected a value greater than 1.
>
>> 		do {
>> 			room = readq_be(&afu->host_map->cmd_room);
>> -			atomic64_set(&afu->room, room);
>> -			if (room)
>> -				goto write_ioarrin;
>> +			if (room) {
>> +				atomic64_cmpxchg(&afu->room, 0, room);
>> +				goto retry;
>> +			}
>
> If you instead fully consume the entry (goto write_ioarrin - similar as it was
> before) and take into account the consumption when you update the cached
> value (i.e.: cmpxchg(..., 0, room - 1) the scenario described above will not occur.
>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-11-17 22:30 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-15 23:12 [PATCH 00/14] cxlflash: Fixes, enhancements, cleanup and staging Uma Krishnan
2016-11-15 23:13 ` [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE Uma Krishnan
2016-11-17 19:20   ` Matthew R. Ochs
2016-11-15 23:14 ` [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable() Uma Krishnan
2016-11-17 19:20   ` Matthew R. Ochs
2016-11-15 23:14 ` [PATCH 03/14] cxlflash: Improve context_reset() logic Uma Krishnan
2016-11-17 19:21   ` Matthew R. Ochs
2016-11-15 23:14 ` [PATCH 04/14] cxlflash: Avoid command room violation Uma Krishnan
2016-11-17 19:36   ` Matthew R. Ochs
2016-11-17 22:30     ` Uma Krishnan
2016-11-15 23:14 ` [PATCH 05/14] cxlflash: Remove unused buffer from AFU command Uma Krishnan
2016-11-15 23:14 ` [PATCH 06/14] cxlflash: Allocate memory instead of using command pool for AFU sync Uma Krishnan
2016-11-15 23:15 ` [PATCH 07/14] cxlflash: Use cmd_size for private commands Uma Krishnan
2016-11-15 23:15 ` [PATCH 08/14] cxlflash: Remove private command pool Uma Krishnan
2016-11-15 23:15 ` [PATCH 09/14] cxlflash: Wait for active AFU commands to timeout upon tear down Uma Krishnan
2016-11-15 23:15 ` [PATCH 10/14] cxlflash: Remove AFU command lock Uma Krishnan
2016-11-15 23:15 ` [PATCH 11/14] cxlflash: Cleanup send_tmf() Uma Krishnan
2016-11-15 23:15 ` [PATCH 12/14] cxlflash: Cleanup queuecommand() Uma Krishnan
2016-11-15 23:16 ` [PATCH 13/14] cxlflash: Migrate IOARRIN specific routines to function pointers Uma Krishnan
2016-11-15 23:16 ` [PATCH 14/14] cxlflash: Migrate scsi command pointer to AFU command Uma Krishnan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).