From: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
To: linux-scsi@vger.kernel.org,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>,
	"Manoj N. Kumar" <manoj@linux.vnet.ibm.com>,
	Brian King <brking@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, Ian Munsie <imunsie@au1.ibm.com>,
	Andrew Donnellan <andrew.donnellan@au1.ibm.com>,
	Daniel Axtens <dja@ozlabs.au.ibm.com>
Subject: [PATCH v2 4/6] cxlflash: Fix to resolve cmd leak after host reset
Date: Mon, 14 Dec 2015 15:07:02 -0600
Message-ID: <1450127222-48145-1-git-send-email-ukrishn@linux.vnet.ibm.com>
In-Reply-To: <1450126293-47440-1-git-send-email-ukrishn@linux.vnet.ibm.com>

From: Manoj Kumar <manoj@linux.vnet.ibm.com>

After a few iterations of resetting the card, either during EEH
recovery or a host_reset, the following is seen in the logs:
cxlflash 0008:00: cxlflash_queuecommand: could not get a free command

Each reset of the card leaks the commands that are outstanding at
that time, because no effort is made to reap them.  A few more resets
later, the above error message floods the logs and the card is
rendered totally unusable, as no free commands are available.

Iterating through the 'cmd' queue and printing out each 'free' counter
showed that on each reset certain commands were in use and stayed
in use through subsequent resets.

To resolve this issue, when the card is reset, reap all the commands
that are active/outstanding.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
---
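For readers who want to see the failure mode in isolation, here is a
minimal user-space sketch of the leak and of the reaping step. It is a
model under stated assumptions, not driver code: NUM_CMDS, struct
model_cmd, struct model_pool and every model_* function are
hypothetical stand-ins, and C11 atomics replace the kernel's atomic_t.
Only the idea (check in every command whose 'free' flag is clear when
the card is reset) is taken from the patch.

```c
/* User-space model only; all names here are hypothetical stand-ins. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_CMDS 4			/* stand-in for CXLFLASH_NUM_CMDS */

struct model_cmd {
	atomic_int free;		/* 1 = available, 0 = in use */
};

struct model_pool {
	struct model_cmd cmd[NUM_CMDS];
};

/* Mark every command in the pool as available. */
static void pool_init(struct model_pool *p)
{
	for (int i = 0; i < NUM_CMDS; i++)
		atomic_init(&p->cmd[i].free, 1);
}

/* Claim the first available command; NULL means none are free. */
static struct model_cmd *model_checkout(struct model_pool *p)
{
	for (int i = 0; i < NUM_CMDS; i++) {
		int expected = 1;

		if (atomic_compare_exchange_strong(&p->cmd[i].free,
						   &expected, 0))
			return &p->cmd[i];
	}
	return NULL;
}

/* Return a command to the pool. */
static void model_checkin(struct model_cmd *c)
{
	assert(c != NULL);
	atomic_store(&c->free, 1);
}

/* Reset that ignores outstanding commands: they stay in use. */
static void model_reset_leaky(struct model_pool *p)
{
	(void)p;
}

/* Reset that reaps every command still marked in use. */
static void model_reset_reaping(struct model_pool *p)
{
	for (int i = 0; i < NUM_CMDS; i++)
		if (!atomic_load(&p->cmd[i].free))
			model_checkin(&p->cmd[i]);
}

/* Diagnostic: count the commands currently available. */
static int count_free(struct model_pool *p)
{
	int n = 0;

	for (int i = 0; i < NUM_CMDS; i++)
		n += atomic_load(&p->cmd[i].free);
	return n;
}
```

In this model, checking out two commands before each leaky reset
exhausts the pool after two resets, at which point model_checkout()
returns NULL. A single model_reset_reaping() call makes all NUM_CMDS
entries available again. The model leaves out the complete() call that
the real stop_afu() issues on each command before checking it in.
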
 drivers/scsi/cxlflash/main.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 35a3202..ac39856 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -632,15 +632,30 @@ static void free_mem(struct cxlflash_cfg *cfg)
  * @cfg:	Internal structure associated with the host.
  *
  * Safe to call with AFU in a partially allocated/initialized state.
+ *
+ * Cleans up all state associated with the command queue, and unmaps
+ * the MMIO space.
+ *
+ *  - complete() will take care of commands we initiated (they'll be checked
+ *  in as part of the cleanup that occurs after the completion)
+ *
+ *  - cmd_checkin() will take care of entries that we did not initiate and that
+ *  have not (and will not) complete because they are sitting on a [now stale]
+ *  hardware queue
  */
 static void stop_afu(struct cxlflash_cfg *cfg)
 {
 	int i;
 	struct afu *afu = cfg->afu;
+	struct afu_cmd *cmd;
 
 	if (likely(afu)) {
-		for (i = 0; i < CXLFLASH_NUM_CMDS; i++)
-			complete(&afu->cmd[i].cevent);
+		for (i = 0; i < CXLFLASH_NUM_CMDS; i++) {
+			cmd = &afu->cmd[i];
+			complete(&cmd->cevent);
+			if (!atomic_read(&cmd->free))
+				cmd_checkin(cmd);
+		}
 
 		if (likely(afu->afu_map)) {
 			cxl_psa_unmap((void __iomem *)afu->afu_map);
-- 
2.1.0

